github ggml-org/whisper.cpp v1.8.0

6 hours ago

Overview

  • Flash attention is now enabled by default
  • Performance improvements (the benchmarks below list results with flash attention off, FA = 0, and on, FA = 1)

M1 Pro

CPU     Config  Model   Th  FA    Enc.   Dec.  Bch5    PP  Commit
M1 Pro  METAL   tiny     1   0   32.44   1.71  0.43  0.04  8a67c55
M1 Pro  METAL   base     1   0   63.54   2.62  0.71  0.06  8a67c55
M1 Pro  METAL   small    1   0  200.30   5.34  1.72  0.17  8a67c55
M1 Pro  METAL   medium   1   0  580.06  11.71  4.18  0.45  8a67c55

CPU     Config  Model   Th  FA    Enc.   Dec.  Bch5    PP  Commit
M1 Pro  METAL   tiny     1   1   22.09   1.84  0.43  0.03  8a67c55
M1 Pro  METAL   base     1   1   40.57   2.22  0.44  0.04  8a67c55
M1 Pro  METAL   small    1   1  135.15   4.23  0.95  0.12  8a67c55
M1 Pro  METAL   medium   1   1  395.18   9.14  2.21  0.30  8a67c55
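
The effect of flash attention on the M1 Pro encoder can be read off as a ratio of the Enc. columns from the two tables above. The following is a minimal sketch and not part of the release: the variable and function names are illustrative, and the numbers are copied from the tables, in whatever unit the benchmark reports.

```python
# Enc. column for M1 Pro, copied from the tables above.
enc_fa_off = {"tiny": 32.44, "base": 63.54, "small": 200.30, "medium": 580.06}
enc_fa_on = {"tiny": 22.09, "base": 40.57, "small": 135.15, "medium": 395.18}


def speedup(before: float, after: float) -> float:
    """Ratio of the two timings; a value above 1 means `after` is faster."""
    return before / after


# One ratio per model: FA = 0 timing divided by FA = 1 timing.
speedups = {model: speedup(enc_fa_off[model], enc_fa_on[model]) for model in enc_fa_off}

for model, ratio in speedups.items():
    print(f"{model:>6}: {ratio:.2f}x")
```

The same calculation applies to the Dec., Bch5 and PP columns. Those ratios are not always above 1: for example, the M1 Pro tiny Dec. value is 1.71 with FA = 0 and 1.84 with FA = 1.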

M2 Ultra

CPU       Config  Model                Th  FA    Enc.  Dec.  Bch5    PP  Commit
M2 ULTRA  METAL   tiny                  1   0    8.63  1.09  0.27  0.01  b57b9d3
M2 ULTRA  METAL   tiny-q5_0             1   0    9.04  1.06  0.28  0.01  b57b9d3
M2 ULTRA  METAL   tiny-q5_1             1   0    8.98  1.06  0.28  0.01  b57b9d3
M2 ULTRA  METAL   tiny-q8_0             1   0    8.69  1.06  0.27  0.01  b57b9d3
M2 ULTRA  METAL   base                  1   0   15.39  1.54  0.43  0.02  b57b9d3
M2 ULTRA  METAL   base-q5_0             1   0   16.50  1.50  0.42  0.02  b57b9d3
M2 ULTRA  METAL   base-q5_1             1   0   16.45  1.49  0.43  0.02  b57b9d3
M2 ULTRA  METAL   base-q8_0             1   0   15.62  1.51  0.42  0.02  b57b9d3
M2 ULTRA  METAL   small                 1   0   45.99  2.99  0.90  0.05  b57b9d3
M2 ULTRA  METAL   small-q5_0            1   0   50.65  2.98  0.92  0.06  b57b9d3
M2 ULTRA  METAL   small-q5_1            1   0   50.74  2.96  0.92  0.06  b57b9d3
M2 ULTRA  METAL   small-q8_0            1   0   47.16  2.83  0.89  0.06  b57b9d3
M2 ULTRA  METAL   medium                1   0  132.78  6.46  2.02  0.13  b57b9d3
M2 ULTRA  METAL   medium-q5_0           1   0  149.35  6.11  2.09  0.14  b57b9d3
M2 ULTRA  METAL   medium-q5_1           1   0  149.11  6.09  2.11  0.14  b57b9d3
M2 ULTRA  METAL   medium-q8_0           1   0  137.37  6.05  2.03  0.13  b57b9d3
M2 ULTRA  METAL   medium-dis            1   0  121.60  0.90  0.25  0.02  b57b9d3
M2 ULTRA  METAL   large-v2              1   0  231.19  9.40  3.10  0.22  b57b9d3
M2 ULTRA  METAL   large-v2-q5_0         1   0  265.90  8.98  3.11  0.25  b57b9d3
M2 ULTRA  METAL   large-v2-q5_1         1   0  265.18  8.92  3.13  0.25  b57b9d3
M2 ULTRA  METAL   large-v2-q8_0         1   0  240.23  9.06  2.98  0.23  b57b9d3
M2 ULTRA  METAL   large-v2-dis          1   0  210.25  0.99  0.28  0.02  b57b9d3
M2 ULTRA  METAL   large-v3-turbo        1   0  211.72  1.52  0.46  0.03  b57b9d3
M2 ULTRA  METAL   large-v3-turbo-q5_0   1   0  242.17  1.40  0.47  0.04  b57b9d3
M2 ULTRA  METAL   large-v3-turbo-q8_0   1   0  219.75  1.40  0.45  0.04  b57b9d3

CPU       Config  Model                Th  FA    Enc.  Dec.  Bch5    PP  Commit
M2 ULTRA  METAL   tiny                  1   1    6.28  0.96  0.22  0.01  a77d11d
M2 ULTRA  METAL   tiny-q5_0             1   1    6.69  0.92  0.22  0.01  a77d11d
M2 ULTRA  METAL   tiny-q5_1             1   1    6.67  0.91  0.22  0.01  a77d11d
M2 ULTRA  METAL   tiny-q8_0             1   1    6.34  0.92  0.21  0.01  a77d11d
M2 ULTRA  METAL   base                  1   1   10.77  1.30  0.32  0.02  a77d11d
M2 ULTRA  METAL   base-q5_0             1   1   11.84  1.23  0.33  0.02  a77d11d
M2 ULTRA  METAL   base-q5_1             1   1   11.95  1.24  0.33  0.02  a77d11d
M2 ULTRA  METAL   base-q8_0             1   1   11.14  1.23  0.32  0.02  a77d11d
M2 ULTRA  METAL   small                 1   1   32.12  2.43  0.65  0.04  a77d11d
M2 ULTRA  METAL   small-q5_0            1   1   36.95  2.42  0.68  0.04  a77d11d
M2 ULTRA  METAL   small-q5_1            1   1   37.40  2.42  0.68  0.04  a77d11d
M2 ULTRA  METAL   small-q8_0            1   1   33.48  2.30  0.65  0.04  a77d11d
M2 ULTRA  METAL   medium                1   1   89.28  5.05  1.46  0.09  a77d11d
M2 ULTRA  METAL   medium-q5_0           1   1  105.24  4.89  1.48  0.11  a77d11d
M2 ULTRA  METAL   medium-q5_1           1   1  105.28  4.98  1.49  0.11  a77d11d
M2 ULTRA  METAL   medium-q8_0           1   1   93.61  4.89  1.43  0.10  a77d11d
M2 ULTRA  METAL   medium-dis            1   1   78.44  0.81  0.20  0.01  a77d11d
M2 ULTRA  METAL   large-v2              1   1  165.69  7.50  2.16  0.17  a77d11d
M2 ULTRA  METAL   large-v2-q5_0         1   1  199.40  7.37  2.18  0.20  a77d11d
M2 ULTRA  METAL   large-v2-q5_1         1   1  199.29  7.37  2.21  0.20  a77d11d
M2 ULTRA  METAL   large-v2-q8_0         1   1  174.60  6.87  2.16  0.18  a77d11d
M2 ULTRA  METAL   large-v2-dis          1   1  145.80  0.90  0.22  0.02  a77d11d
M2 ULTRA  METAL   large-v3-turbo        1   1  146.98  1.31  0.34  0.03  a77d11d
M2 ULTRA  METAL   large-v3-turbo-q5_0   1   1  176.77  1.19  0.35  0.03  a77d11d
M2 ULTRA  METAL   large-v3-turbo-q8_0   1   1  154.73  1.20  0.33  0.03  a77d11d

M4 Max

CPU     Config  Model           Th  FA    Enc.   Dec.  Bch5    PP  Commit
M4 Max  METAL   tiny             1   0   10.51   0.86  0.23  0.01  47fcd7d
M4 Max  METAL   tiny-q8_0        1   0   10.73   0.84  0.24  0.01  47fcd7d
M4 Max  METAL   base             1   0   19.50   1.34  0.36  0.02  47fcd7d
M4 Max  METAL   base-q8_0        1   0   20.17   1.25  0.36  0.02  47fcd7d
M4 Max  METAL   small            1   0   61.91   2.77  0.78  0.06  47fcd7d
M4 Max  METAL   small-q8_0       1   0   64.17   2.43  0.78  0.06  47fcd7d
M4 Max  METAL   medium           1   0  181.50   6.44  1.85  0.15  47fcd7d
M4 Max  METAL   medium-q8_0      1   0  187.71   5.80  1.84  0.15  47fcd7d
M4 Max  METAL   large-v2         1   0  335.49  10.49  3.01  0.26  47fcd7d
M4 Max  METAL   large-v2-q8_0    1   0  349.89   8.65  2.97  0.27  47fcd7d
M4 Max  METAL   large-v3-turbo   1   0  301.34   1.83  0.49  0.04  47fcd7d

CPU     Config  Model           Th  FA    Enc.   Dec.  Bch5    PP  Commit
M4 Max  METAL   tiny             1   1    8.23   0.71  0.16  0.01  47fcd7d
M4 Max  METAL   tiny-q8_0        1   1    8.47   0.67  0.16  0.01  47fcd7d
M4 Max  METAL   base             1   1   15.47   1.12  0.26  0.02  47fcd7d
M4 Max  METAL   base-q8_0        1   1   15.70   1.05  0.27  0.02  47fcd7d
M4 Max  METAL   small            1   1   49.82   2.37  0.53  0.05  47fcd7d
M4 Max  METAL   small-q8_0       1   1   51.76   1.99  0.53  0.05  47fcd7d
M4 Max  METAL   medium           1   1  147.76   5.52  1.27  0.12  47fcd7d
M4 Max  METAL   medium-q8_0      1   1  153.98   4.59  1.24  0.13  47fcd7d
M4 Max  METAL   large-v2         1   1  282.89   9.06  2.11  0.22  47fcd7d
M4 Max  METAL   large-v2-q8_0    1   1  296.43   7.44  2.09  0.23  47fcd7d
M4 Max  METAL   large-v3-turbo   1   1  249.91   1.65  0.38  0.04  47fcd7d

RTX 5090

GPU       Config  Model                Th  FA   Enc.  Dec.  Bch5    PP  Commit
RTX 5090  CUDA    tiny                  1   0   2.06  0.55  0.13  0.00  e4bf87b
RTX 5090  CUDA    tiny-q8_0             1   0   2.50  0.55  0.14  0.01  e4bf87b
RTX 5090  CUDA    base                  1   0   3.72  0.81  0.19  0.01  e4bf87b
RTX 5090  CUDA    base-q8_0             1   0   4.35  0.79  0.20  0.01  e4bf87b
RTX 5090  CUDA    small                 1   0  11.24  1.55  0.38  0.02  e4bf87b
RTX 5090  CUDA    small-q8_0            1   0  12.69  1.69  0.40  0.02  e4bf87b
RTX 5090  CUDA    medium                1   0  31.16  3.19  0.79  0.04  e4bf87b
RTX 5090  CUDA    medium-q8_0           1   0  32.74  3.43  0.80  0.05  e4bf87b
RTX 5090  CUDA    large-v2              1   0  50.09  4.55  1.14  0.05  e4bf87b
RTX 5090  CUDA    large-v2-q8_0         1   0  52.44  4.76  1.11  0.07  e4bf87b
RTX 5090  CUDA    large-v3-turbo        1   0  46.78  0.70  0.17  0.01  e4bf87b
RTX 5090  CUDA    large-v3-turbo-q8_0   1   0  48.57  0.70  0.16  0.01  e4bf87b

GPU       Config  Model                Th  FA   Enc.  Dec.  Bch5    PP  Commit
RTX 5090  CUDA    tiny                  1   1   1.39  0.47  0.11  0.00  e4bf87b
RTX 5090  CUDA    tiny-q8_0             1   1   1.83  0.48  0.12  0.01  e4bf87b
RTX 5090  CUDA    base                  1   1   2.17  0.70  0.16  0.01  e4bf87b
RTX 5090  CUDA    base-q8_0             1   1   2.78  0.68  0.17  0.01  e4bf87b
RTX 5090  CUDA    small                 1   1   5.02  1.33  0.32  0.01  e4bf87b
RTX 5090  CUDA    small-q8_0            1   1   6.39  1.46  0.34  0.02  e4bf87b
RTX 5090  CUDA    medium                1   1  13.89  2.68  0.64  0.03  e4bf87b
RTX 5090  CUDA    medium-q8_0           1   1  15.40  2.92  0.67  0.04  e4bf87b
RTX 5090  CUDA    large-v2              1   1  21.24  3.88  0.96  0.04  e4bf87b
RTX 5090  CUDA    large-v2-q8_0         1   1  23.54  4.01  0.93  0.05  e4bf87b
RTX 5090  CUDA    large-v3-turbo        1   1  18.18  0.62  0.15  0.01  e4bf87b
RTX 5090  CUDA    large-v3-turbo-q8_0   1   1  19.89  0.61  0.14  0.01  e4bf87b
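
For readers who want to compare rows programmatically, the benchmark rows above can be parsed by peeling fields off from the right, because the device name ("RTX 5090", "M1 Pro") contains a space while the remaining nine fields do not. This is a rough sketch, not tooling shipped with the release: `parse_row` and the field names are hypothetical, and the two sample rows are copied from the RTX 5090 tables.

```python
def parse_row(line: str) -> dict:
    """Split one whitespace-separated benchmark row into named fields.

    The last nine tokens are Config, Model, Th, FA, Enc., Dec., Bch5, PP
    and Commit; whatever precedes them is the device name.
    """
    tokens = line.split()
    device = " ".join(tokens[:-9])
    config, model, th, fa, enc, dec, bch5, pp, commit = tokens[-9:]
    return {
        "device": device, "config": config, "model": model,
        "threads": int(th), "fa": int(fa),
        "enc": float(enc), "dec": float(dec),
        "bch5": float(bch5), "pp": float(pp),
        "commit": commit,
    }


sample = """
RTX 5090 CUDA large-v2 1 0 50.09 4.55 1.14 0.05 e4bf87b
RTX 5090 CUDA large-v2 1 1 21.24 3.88 0.96 0.04 e4bf87b
"""
rows = [parse_row(line) for line in sample.strip().splitlines()]

# Index the two rows by their FA flag and compare encoder timings.
by_fa = {row["fa"]: row for row in rows}
ratio = by_fa[0]["enc"] / by_fa[1]["enc"]
print(f'{rows[0]["device"]} {rows[0]["model"]}: Enc. with FA off / FA on = {ratio:.2f}')
```

Splitting on arbitrary whitespace means the function accepts rows whether columns are separated by single spaces or padded for alignment.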

What's Changed

New Contributors

Full Changelog: v1.7.6...v1.8.0
