Releases: ggerganov/whisper.cpp
v1.7.1
Overview
- Fix Vulkan crashes
- Performance stats for Vulkan on RTX 2060 (in the tables below: Th = threads, FA = flash attention enabled, timings in ms)
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
RTX 2060 | VULKAN | tiny | 1 | 0 | 30.38 | 1.37 | 1.04 | 0.05 | 9f346d0 |
RTX 2060 | VULKAN | tiny-q5_0 | 1 | 0 | 20.98 | 1.38 | 0.99 | 0.05 | 9f346d0 |
RTX 2060 | VULKAN | tiny-q5_1 | 1 | 0 | 20.74 | 1.30 | 0.96 | 0.05 | 9f346d0 |
RTX 2060 | VULKAN | base | 1 | 0 | 44.69 | 1.59 | 1.78 | 0.09 | 9f346d0 |
RTX 2060 | VULKAN | base-q5_0 | 1 | 0 | 39.72 | 2.11 | 1.72 | 0.08 | 9f346d0 |
RTX 2060 | VULKAN | base-q5_1 | 1 | 0 | 39.45 | 2.01 | 1.63 | 0.08 | 9f346d0 |
RTX 2060 | VULKAN | small | 1 | 0 | 160.02 | 3.53 | 4.64 | 0.23 | 9f346d0 |
RTX 2060 | VULKAN | small-q5_0 | 1 | 0 | 141.52 | 4.54 | 4.44 | 0.20 | 9f346d0 |
RTX 2060 | VULKAN | small-q5_1 | 1 | 0 | 141.03 | 4.63 | 4.18 | 0.20 | 9f346d0 |
RTX 2060 | VULKAN | medium | 1 | 0 | 472.66 | 7.55 | 11.35 | 0.56 | 9f346d0 |
RTX 2060 | VULKAN | medium-q5_0 | 1 | 0 | 395.55 | 9.81 | 10.64 | 0.49 | 9f346d0 |
RTX 2060 | VULKAN | medium-q5_1 | 1 | 0 | 398.85 | 10.16 | 10.15 | 0.50 | 9f346d0 |
RTX 2060 | VULKAN | medium-dis | 1 | 0 | 427.26 | 1.26 | 1.20 | 0.08 | 9f346d0 |
RTX 2060 | VULKAN | large-v2 | 1 | 0 | 924.60 | 12.36 | 18.56 | 1.01 | 9f346d0 |
RTX 2060 | VULKAN | large-v2-q5_0 | 1 | 0 | 774.21 | 17.25 | 17.17 | 0.85 | 9f346d0 |
RTX 2060 | VULKAN | large-v2-q5_1 | 1 | 0 | 779.75 | 17.44 | 16.27 | 0.85 | 9f346d0 |
RTX 2060 | VULKAN | large-v2-dis | 1 | 0 | 833.35 | 1.38 | 1.56 | 0.10 | 9f346d0 |
RTX 2060 | VULKAN | large-v3-turbo | 1 | 0 | 839.90 | 2.11 | 2.70 | 0.16 | 9f346d0 |
RTX 2060 | VULKAN | large-v3-turbo-q5_0 | 1 | 0 | 705.49 | 3.22 | 2.53 | 0.14 | 9f346d0 |
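As a quick sanity check on the table above, the encoder speedup from q5_0 quantization can be derived directly from the reported numbers (a minimal sketch; values copied from the Vulkan table, RTX 2060, commit 9f346d0):

```python
# Encoder times (ms) from the Vulkan table above.
enc_ms = {
    "tiny": 30.38, "tiny-q5_0": 20.98,
    "medium": 472.66, "medium-q5_0": 395.55,
    "large-v2": 924.60, "large-v2-q5_0": 774.21,
}

def speedup(base: str, quant: str) -> float:
    """Encoder speedup of the quantized model over the F16 baseline."""
    return enc_ms[base] / enc_ms[quant]

for base in ("tiny", "medium", "large-v2"):
    print(f"{base}: q5_0 encodes {speedup(base, base + '-q5_0'):.2f}x faster")
```

Note that on this backend the quantized encoder gain is largest for `tiny` (~1.45x) and settles around 1.19x for the bigger models.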
What's Changed
- Retry allocation with fallback flags by @SRHMorris in #2451
New Contributors
- @SRHMorris made their first contribution in #2451
Full Changelog: v1.7.0...v1.7.1
Binaries
https://github.com/ggerganov/whisper.cpp/actions/runs/11213279590
v1.7.0
Overview
- Fix crashes with high number of beams
- Reduce overall VRAM usage
- Optimize Encoder performance
Some performance numbers for this release:
M2 Ultra
Flash Attention ON:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
M2 Ultra | METAL | tiny | 1 | 1 | 8.37 | 1.44 | 0.48 | 0.01 | 6a94163 |
M2 Ultra | METAL | tiny-q5_0 | 1 | 1 | 9.81 | 1.46 | 0.50 | 0.01 | 6a94163 |
M2 Ultra | METAL | tiny-q5_1 | 1 | 1 | 8.80 | 1.47 | 0.50 | 0.01 | 6a94163 |
M2 Ultra | METAL | base | 1 | 1 | 16.11 | 1.96 | 0.74 | 0.02 | 6a94163 |
M2 Ultra | METAL | base-q5_0 | 1 | 1 | 16.38 | 1.99 | 0.78 | 0.02 | 6a94163 |
M2 Ultra | METAL | base-q5_1 | 1 | 1 | 16.72 | 2.00 | 0.77 | 0.02 | 6a94163 |
M2 Ultra | METAL | small | 1 | 1 | 41.26 | 3.88 | 1.66 | 0.05 | 6a94163 |
M2 Ultra | METAL | small-q5_0 | 1 | 1 | 46.91 | 4.02 | 1.76 | 0.06 | 6a94163 |
M2 Ultra | METAL | small-q5_1 | 1 | 1 | 47.05 | 4.00 | 1.73 | 0.06 | 6a94163 |
M2 Ultra | METAL | medium | 1 | 1 | 111.29 | 7.79 | 3.63 | 0.11 | 6a94163 |
M2 Ultra | METAL | medium-q5_0 | 1 | 1 | 129.78 | 7.71 | 3.85 | 0.13 | 6a94163 |
M2 Ultra | METAL | medium-q5_1 | 1 | 1 | 129.29 | 7.71 | 3.87 | 0.13 | 6a94163 |
M2 Ultra | METAL | medium-dis | 1 | 1 | 99.27 | 1.09 | 0.43 | 0.02 | 6a94163 |
M2 Ultra | METAL | large-v2 | 1 | 1 | 198.81 | 11.54 | 5.59 | 0.20 | 6a94163 |
M2 Ultra | METAL | large-v2-q5_0 | 1 | 1 | 236.18 | 11.12 | 6.11 | 0.24 | 6a94163 |
M2 Ultra | METAL | large-v2-q5_1 | 1 | 1 | 235.88 | 11.14 | 6.01 | 0.24 | 6a94163 |
M2 Ultra | METAL | large-v2-dis | 1 | 1 | 177.41 | 1.21 | 0.48 | 0.02 | 6a94163 |
M2 Ultra | METAL | large-v3-turbo | 1 | 1 | 178.92 | 1.89 | 0.83 | 0.03 | 6a94163 |
M2 Ultra | METAL | large-v3-turbo-q5_0 | 1 | 1 | 211.44 | 1.73 | 0.90 | 0.04 | 6a94163 |
Flash Attention OFF:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
M2 Ultra | METAL | tiny | 1 | 0 | 10.04 | 1.37 | 0.50 | 0.01 | 6a94163 |
M2 Ultra | METAL | tiny-q5_0 | 1 | 0 | 10.02 | 1.36 | 0.53 | 0.01 | 6a94163 |
M2 Ultra | METAL | tiny-q5_1 | 1 | 0 | 11.08 | 1.37 | 0.53 | 0.01 | 6a94163 |
M2 Ultra | METAL | base | 1 | 0 | 17.84 | 1.93 | 0.77 | 0.02 | 6a94163 |
M2 Ultra | METAL | base-q5_0 | 1 | 0 | 18.57 | 1.92 | 0.81 | 0.02 | 6a94163 |
M2 Ultra | METAL | base-q5_1 | 1 | 0 | 18.66 | 1.93 | 0.82 | 0.02 | 6a94163 |
M2 Ultra | METAL | small | 1 | 0 | 48.26 | 3.95 | 1.73 | 0.05 | 6a94163 |
M2 Ultra | METAL | small-q5_0 | 1 | 0 | 53.68 | 3.99 | 1.85 | 0.06 | 6a94163 |
M2 Ultra | METAL | small-q5_1 | 1 | 0 | 53.86 | 4.00 | 1.82 | 0.06 | 6a94163 |
M2 Ultra | METAL | medium | 1 | 0 | 130.09 | 8.01 | 3.82 | 0.13 | 6a94163 |
M2 Ultra | METAL | medium-q5_0 | 1 | 0 | 148.18 | 7.92 | 4.11 | 0.14 | 6a94163 |
M2 Ultra | METAL | medium-q5_1 | 1 | 0 | 147.95 | 7.94 | 4.11 | 0.14 | 6a94163 |
M2 Ultra | METAL | medium-dis | 1 | 0 | 116.97 | 1.11 | 0.42 | 0.02 | 6a94163 |
M2 Ultra | METAL | large-v2 | 1 | 0 | 232.43 | 12.34 | 5.87 | 0.22 | 6a94163 |
M2 Ultra | METAL | large-v2-q5_0 | 1 | 0 | 269.72 | 11.68 | 6.44 | 0.26 | 6a94163 |
M2 Ultra | METAL | large-v2-q5_1 | 1 | 0 | 269.71 | 11.82 | 6.36 | 0.26 | 6a94163 |
M2 Ultra | METAL | large-v2-dis | 1 | 0 | 209.25 | 1.25 | 0.48 | 0.02 | 6a94163 |
M2 Ultra | METAL | large-v3-turbo | 1 | 0 | 211.09 | 1.98 | 0.84 | 0.03 | 6a94163 |
M2 Ultra | METAL | large-v3-turbo-q5_0 | 1 | 0 | 244.23 | 1.81 | 0.92 | 0.04 | 6a94163 |
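The effect of flash attention on the M2 Ultra can be read off by dividing the two tables above (a small sketch; encoder times in ms copied from the FA OFF and FA ON tables, commit 6a94163):

```python
# Encoder times (ms) for M2 Ultra (METAL), flash attention off vs on.
enc_off = {"tiny": 10.04, "medium": 130.09, "large-v2": 232.43}
enc_on  = {"tiny":  8.37, "medium": 111.29, "large-v2": 198.81}

# Speedup factor from enabling flash attention, per model.
fa_speedup = {m: enc_off[m] / enc_on[m] for m in enc_off}
for m, s in fa_speedup.items():
    print(f"{m}: flash attention speeds up encoding by {s:.2f}x")
```

The gain is fairly uniform, roughly 1.17-1.20x on the encoder across model sizes.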
Ryzen 9 5950X + RTX 2060
Flash Attention ON:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
RTX 2060 | AVX2 CUDA | tiny | 1 | 1 | 7.35 | 0.78 | 0.24 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | tiny-q5_0 | 1 | 1 | 6.45 | 0.67 | 0.14 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | tiny-q5_1 | 1 | 1 | 6.39 | 0.66 | 0.14 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | base | 1 | 1 | 10.20 | 0.88 | 0.30 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | base-q5_0 | 1 | 1 | 11.38 | 0.92 | 0.21 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | base-q5_1 | 1 | 1 | 11.76 | 0.91 | 0.20 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | small | 1 | 1 | 33.06 | 2.00 | 0.56 | 0.03 | 6a94163 |
RTX 2060 | AVX2 CUDA | small-q5_0 | 1 | 1 | 35.84 | 1.84 | 0.43 | 0.04 | 6a94163 |
RTX 2060 | AVX2 CUDA | small-q5_1 | 1 | 1 | 36.89 | 1.82 | 0.42 | 0.04 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium | 1 | 1 | 90.65 | 4.54 | 1.13 | 0.08 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-q5_0 | 1 | 1 | 104.01 | 3.80 | 0.91 | 0.10 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-q5_1 | 1 | 1 | 107.98 | 3.72 | 0.87 | 0.10 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-dis | 1 | 1 | 79.08 | 0.68 | 0.17 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2 | 1 | 1 | 162.00 | 7.52 | 1.92 | 0.14 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-q5_0 | 1 | 1 | 184.59 | 5.64 | 1.50 | 0.16 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-q5_1 | 1 | 1 | 193.85 | 5.55 | 1.44 | 0.17 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-dis | 1 | 1 | 140.75 | 0.84 | 0.37 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v3-turbo | 1 | 1 | 143.38 | 1.29 | 0.36 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v3-turbo-q5_0 | 1 | 1 | 163.30 | 0.93 | 0.28 | 0.03 | 6a94163 |
Flash Attention OFF:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
RTX 2060 | AVX2 CUDA | tiny | 1 | 0 | 12.49 | 0.87 | 0.23 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | tiny-q5_0 | 1 | 0 | 10.65 | 0.78 | 0.19 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | tiny-q5_1 | 1 | 0 | 10.82 | 0.77 | 0.19 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | base | 1 | 0 | 18.97 | 1.04 | 0.34 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | base-q5_0 | 1 | 0 | 20.22 | 1.09 | 0.27 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | base-q5_1 | 1 | 0 | 20.48 | 1.07 | 0.27 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | small | 1 | 0 | 59.52 | 2.37 | 0.70 | 0.05 | 6a94163 |
RTX 2060 | AVX2 CUDA | small-q5_0 | 1 | 0 | 62.98 | 2.23 | 0.60 | 0.06 | 6a94163 |
RTX 2060 | AVX2 CUDA | small-q5_1 | 1 | 0 | 63.64 | 2.21 | 0.59 | 0.06 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium | 1 | 0 | 161.53 | 5.36 | 1.53 | 0.13 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-q5_0 | 1 | 0 | 174.96 | 4.64 | 1.32 | 0.15 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-q5_1 | 1 | 0 | 178.42 | 4.57 | 1.29 | 0.15 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-dis | 1 | 0 | 149.65 | 0.75 | 0.20 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2 | 1 | 0 | 280.55 | 8.74 | 2.51 | 0.23 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-q5_0 | 1 | 0 | 306.87 | 6.92 | 2.08 | 0.25 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-q5_1 | 1 | 0 | 314.25 | 6.82 | 2.02 | 0.26 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-dis | 1 | 0 | 259.39 | 0.91 | 0.37 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v3-turbo | 1 | 0 | 261.83 | 1.44 | 0.41 | 0.04 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v3-turbo-q5_0 | 1 | 0 | 282.99 | 1.09 | 0.33 | 0.04 | 6a94163 |
Vulkan:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
RTX 2060 | VULKAN | tiny | 1 | 0 | 30.38 | 1.37 | 1.04 | 0.05 | 9f346d0 |
RTX 2060 | VULKAN | tiny-q5_0 | 1 | 0 | 20.98 | 1.38 | 0.99 | 0.05 | 9f346d0 |
RTX 2060 | VULKAN | tiny-q5_1 | 1 | 0 | 20.74 | 1.30 | 0.96 | 0.05 | 9f346d0 |
RTX 2060 | VULKAN | base | 1 | 0 | 44.69 | 1.59 | 1.78 | 0.09 | 9f346d0 |
RTX 2060 | VULKAN | base-q5_0 | 1 | 0 | 39.72 | 2.11 | 1.72 | 0.08 | 9f346d0 |
RTX 2060 | VULKAN | base-q5_1 | 1 | 0 | 39.45 | 2.01 | 1.63 | 0.08 | 9f346d0 |
RTX 2060 | VULKAN | small | 1 | 0 | 160.02 | 3.53 | 4.64 | 0.23 | 9f346d0 |
RTX 2060 | VULKAN | small-q5_0 | 1 | 0 | 141.52 | 4.54 | 4.44 | 0.20 | 9f346d0 |
RTX 2060 | VULKA... |
v1.6.2
Overview
Bug fix when using multiple `whisper_state` instances in parallel: #2182
What's Changed
- Update ruby bindings by @taf2 in #2154
- Update server.cpp by @dvaldivia in #2181
- Revert "whisper : remove extra backend instance (huh?)" by @ggerganov in #2182
New Contributors
- @dvaldivia made their first contribution in #2181
Full Changelog: v1.6.1...v1.6.2
v1.6.1
Minor release adding initial ffmpeg support in the examples #2133 (thx @WilliamTambellini)
What's Changed
- ci: Update build.yml to suppress warnings about node.js versions by @tamo in #2166
- node : add flash_attn param by @pprobst in #2170
- Add support for decoding input with ffmpeg (Linux) by @WilliamTambellini in #2133
New Contributors
- @WilliamTambellini made their first contribution in #2133
Full Changelog: v1.6.0...v1.6.1
v1.6.0
Overview
- Can optionally enable Flash Attention for faster processing on CUDA and Metal devices (#2152)
- Faster ppc64 performance (40aeeee) (not tested)
- Fix `main` slowdown bug (#2070)
Shoutout to @JohannesGaessler for contributing efficient FA CUDA kernels
Some performance numbers for this release:
M1 Pro
CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
M1 Pro | METAL | tiny | 1 | 0 | 39.21 | 1.74 | 0.61 | 0.04 | 22c96b4 |
M1 Pro | METAL | base | 1 | 0 | 70.76 | 2.60 | 0.93 | 0.06 | 22c96b4 |
M1 Pro | METAL | small | 1 | 0 | 217.28 | 6.42 | 2.14 | 0.17 | 22c96b4 |
M1 Pro | METAL | medium | 1 | 0 | 596.74 | 14.43 | 4.75 | 0.45 | 22c96b4 |
CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
M1 Pro | METAL | tiny | 1 | 1 | 30.77 | 1.59 | 0.54 | 0.03 | 22c96b4 |
M1 Pro | METAL | base | 1 | 1 | 60.42 | 2.29 | 0.81 | 0.05 | 22c96b4 |
M1 Pro | METAL | small | 1 | 1 | 183.82 | 5.12 | 1.81 | 0.14 | 22c96b4 |
M1 Pro | METAL | medium | 1 | 1 | 517.92 | 11.60 | 4.01 | 0.38 | 22c96b4 |
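Assuming the Dec. column is the per-token decode time in ms (an assumption; the tables do not state units), the two M1 Pro tables above translate into decode throughput with and without flash attention:

```python
# Dec. column (assumed ms per decoded token) from the two M1 Pro tables
# above (commit 22c96b4): FA off vs FA on.
dec_ms_off = {"tiny": 1.74, "base": 2.60, "small": 6.42, "medium": 14.43}
dec_ms_on  = {"tiny": 1.59, "base": 2.29, "small": 5.12, "medium": 11.60}

def tokens_per_sec(ms_per_token: float) -> float:
    """Convert a per-token latency in ms to tokens/second."""
    return 1000.0 / ms_per_token

for m in dec_ms_off:
    print(f"{m}: {tokens_per_sec(dec_ms_off[m]):.0f} -> "
          f"{tokens_per_sec(dec_ms_on[m]):.0f} tok/s with flash attention")
```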
M2 Ultra
CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
M2 ULTRA | METAL | tiny | 1 | 0 | 12.32 | 1.35 | 0.49 | 0.01 | 22c96b4 |
M2 ULTRA | METAL | tiny-q5_0 | 1 | 0 | 11.65 | 1.30 | 0.51 | 0.01 | 22c96b4 |
M2 ULTRA | METAL | tiny-q5_1 | 1 | 0 | 12.08 | 1.30 | 0.51 | 0.01 | 22c96b4 |
M2 ULTRA | METAL | base | 1 | 0 | 17.58 | 1.90 | 0.76 | 0.02 | 22c96b4 |
M2 ULTRA | METAL | base-q5_0 | 1 | 0 | 18.89 | 1.86 | 0.79 | 0.02 | 22c96b4 |
M2 ULTRA | METAL | base-q5_1 | 1 | 0 | 20.69 | 1.88 | 0.79 | 0.02 | 22c96b4 |
M2 ULTRA | METAL | small | 1 | 0 | 49.32 | 3.85 | 1.71 | 0.05 | 22c96b4 |
M2 ULTRA | METAL | small-q5_0 | 1 | 0 | 54.91 | 3.81 | 1.82 | 0.06 | 22c96b4 |
M2 ULTRA | METAL | small-q5_1 | 1 | 0 | 54.92 | 3.81 | 1.79 | 0.06 | 22c96b4 |
M2 ULTRA | METAL | medium | 1 | 0 | 134.34 | 8.04 | 3.82 | 0.13 | 22c96b4 |
M2 ULTRA | METAL | medium-q5_0 | 1 | 0 | 151.68 | 7.59 | 4.07 | 0.14 | 22c96b4 |
M2 ULTRA | METAL | medium-q5_1 | 1 | 0 | 151.58 | 7.67 | 4.07 | 0.14 | 22c96b4 |
M2 ULTRA | METAL | medium-dis | 1 | 0 | 120.82 | 1.07 | 0.41 | 0.02 | 22c96b4 |
M2 ULTRA | METAL | large-v2 | 1 | 0 | 235.63 | 12.27 | 5.85 | 0.22 | 22c96b4 |
M2 ULTRA | METAL | large-v2-q5_0 | 1 | 0 | 273.38 | 11.17 | 6.40 | 0.26 | 22c96b4 |
M2 ULTRA | METAL | large-v2-q5_1 | 1 | 0 | 272.44 | 11.32 | 6.29 | 0.26 | 22c96b4 |
M2 ULTRA | METAL | large-v2-dis | 1 | 0 | 212.51 | 1.20 | 0.47 | 0.02 | 22c96b4 |
CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
M2 ULTRA | METAL | tiny | 1 | 1 | 9.07 | 1.33 | 0.45 | 0.01 | 22c96b4 |
M2 ULTRA | METAL | tiny-q5_0 | 1 | 1 | 9.74 | 1.33 | 0.47 | 0.01 | 22c96b4 |
M2 ULTRA | METAL | tiny-q5_1 | 1 | 1 | 8.93 | 1.31 | 0.46 | 0.01 | 22c96b4 |
M2 ULTRA | METAL | base | 1 | 1 | 15.75 | 1.87 | 0.71 | 0.02 | 22c96b4 |
M2 ULTRA | METAL | base-q5_0 | 1 | 1 | 17.04 | 1.83 | 0.74 | 0.02 | 22c96b4 |
M2 ULTRA | METAL | base-q5_1 | 1 | 1 | 17.17 | 1.83 | 0.74 | 0.02 | 22c96b4 |
M2 ULTRA | METAL | small | 1 | 1 | 42.33 | 3.64 | 1.60 | 0.05 | 22c96b4 |
M2 ULTRA | METAL | small-q5_0 | 1 | 1 | 47.61 | 3.63 | 1.70 | 0.05 | 22c96b4 |
M2 ULTRA | METAL | small-q5_1 | 1 | 1 | 47.70 | 3.66 | 1.68 | 0.05 | 22c96b4 |
M2 ULTRA | METAL | medium | 1 | 1 | 114.42 | 7.53 | 3.55 | 0.11 | 22c96b4 |
M2 ULTRA | METAL | medium-q5_0 | 1 | 1 | 132.63 | 7.02 | 3.77 | 0.13 | 22c96b4 |
M2 ULTRA | METAL | medium-q5_1 | 1 | 1 | 132.28 | 7.10 | 3.76 | 0.13 | 22c96b4 |
M2 ULTRA | METAL | medium-dis | 1 | 1 | 102.34 | 1.01 | 0.42 | 0.01 | 22c96b4 |
M2 ULTRA | METAL | large-v2 | 1 | 1 | 203.01 | 11.03 | 5.45 | 0.20 | 22c96b4 |
M2 ULTRA | METAL | large-v2-q5_0 | 1 | 1 | 240.05 | 10.18 | 5.98 | 0.23 | 22c96b4 |
M2 ULTRA | METAL | large-v2-q5_1 | 1 | 1 | 239.22 | 10.23 | 5.87 | 0.23 | 22c96b4 |
M2 ULTRA | METAL | large-v2-dis | 1 | 1 | 181.14 | 1.14 | 0.48 | 0.02 | 22c96b4 |
Ryzen 9 5950X + RTX 2060
CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
Ryzen 9 5950X | AVX2 | tiny | 8 | 0 | 195.29 | 1.57 | 0.51 | 0.26 | 22c96b4 |
Ryzen 9 5950X | AVX2 | tiny-q5_0 | 8 | 0 | 213.33 | 1.10 | 0.50 | 0.30 | 22c96b4 |
Ryzen 9 5950X | AVX2 | tiny-q5_1 | 8 | 0 | 219.38 | 1.18 | 0.53 | 0.32 | 22c96b4 |
Ryzen 9 5950X | AVX2 | base | 8 | 0 | 424.85 | 3.71 | 1.03 | 0.46 | 22c96b4 |
Ryzen 9 5950X | AVX2 | base-q5_0 | 8 | 0 | 473.61 | 1.81 | 0.82 | 0.52 | 22c96b4 |
Ryzen 9 5950X | AVX2 | base-q5_1 | 8 | 0 | 484.14 | 1.92 | 0.85 | 0.56 | 22c96b4 |
Ryzen 9 5950X | AVX2 | small | 8 | 0 | 1458.32 | 12.66 | 3.09 | 1.26 | 22c96b4 |
Ryzen 9 5950X | AVX2 | small-q5_0 | 8 | 0 | 1673.22 | 6.42 | 2.18 | 1.45 | 22c96b4 |
Ryzen 9 5950X | AVX2 | small-q5_1 | 8 | 0 | 1724.78 | 6.72 | 2.32 | 1.52 | 22c96b4 |
Ryzen 9 5950X | AVX2 | medium | 8 | 0 | 4333.87 | 36.80 | 8.56 | 3.37 | 22c96b4 |
Ryzen 9 5950X | AVX2 | medium-q5_0 | 8 | 0 | 5194.09 | 19.21 | 5.71 | 3.97 | 22c96b4 |
Ryzen 9 5950X | AVX2 | medium-q5_1 | 8 | 0 | 5450.39 | 20.01 | 5.99 | 4.17 | 22c96b4 |
Ryzen 9 5950X | AVX2 | medium-dis | 8 | 0 | 3995.19 | 5.08 | 1.21 | 0.55 | 22c96b4 |
Ryzen 9 5950X | AVX2 | large-v2 | 8 | 0 | 8056.16 | 69.74 | 16.11 | 6.13 | 22c96b4 |
Ryzen 9 5950X | AVX2 | large-v2-q5_0 | 8 | 0 | 9799.58 | 35.16 | 10.49 | 7.28 | 22c96b4 |
Ryzen 9 5950X | AVX2 | large-v2-q5_1 | 8 | 0 | — | 36.74 | 11.02 | 7.65 | 22c96b4 |
Ryzen 9 5950X | AVX2 | large-v2-dis | 8 | 0 | 7490.03 | 7.40 | 1.70 | 0.72 | 22c96b4 |
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
RTX 2060 | AVX2 CUDA | tiny | 8 | 0 | 12.54 | 0.93 | 0.29 | 0.02 | 22c96b4 |
RTX 2060 | AVX2 CUDA | tiny-q5_0 | 8 | 0 | 12.73 | 0.98 | 0.24 | 0.02 | 22c96b4 |
RTX 2060 | AVX2 CUDA | tiny-q5_1 | 8 | 0 | 12.72 | 0.99 | 0.24 | 0.02 | 22c96b4 |
RTX 2060 | AVX2 CUDA | base | 8 | 0 | 24.14 | 1.28 | 0.41 | 0.03 | 22c96b4 |
RTX 2060 | AVX2 CUDA | base-q5_0 | 8 | 0 | 24.58 | 1.38 | 0.35 | 0.03 | 22c96b4 |
RTX 2060 | AVX2 CUDA | base-q5_1 | 8 | 0 | 24.58 | 1.37 | 0.35 | 0.03 | 22c96b4 |
RTX 2060 | AVX2 CUDA | small | 8 | 0 | 74.70 | 2.91 | 0.84 | 0.07 | 22c96b4 |
RTX 2060 | AVX2 CUDA | small-q5_0 | 8 | 0 | 76.12 | 2.84 | 0.77 | 0.08 | 22c96b4 |
RTX 2060 | AVX2 CUDA | small-q5_1 | 8 | 0 | 76.14 | 2.84 | 0.76 | 0.08 | 22c96b4 |
RTX 2060 | AVX2 CUDA | medium | 8 | 0 | 200.69 | 6.46 | 1.83 | 0.17 | 22c96b4 |
RTX 2060 | AVX2 CUDA | medium-q5_0 | 8 | 0 | 204.80 | 5.90 | 1.65 | 0.19 | 22c96b4 |
RTX 2060 | AVX2 CUDA | medium-q5_1 | 8 | 0 | 205.61 | 5.85 | 1.61 | 0.19 | 22c96b4 |
RTX 2060 | AVX2 CUDA | medium-dis | 8 | 0 | 186.17 | 0.86 | 0.24 | 0.02 | 22c96b4 |
RTX 2060 | AVX2 CUDA | large-v2 | 8 | 0 | 347.22 | 10.36 | 2.82 | 0.29 | 22c96b4 |
RTX 2060 | AVX2 CUDA | large-v2-q5_0 | 8 | 0 | 357.06 | 8.81 | 2.58 | 0.34 | 22c96b4 |
RTX 2060 | AVX2 CUDA | large-v2-q5_1 | 8 | 0 | 356.97 | 8.62 | 2.49 | 0.33 | 22c96b4 |
RTX 2060 | AVX2 CUDA | large-v2-dis | 8 | 0 | 318.05 | 1.03 | 0.34 | 0.04 | 22c96b4 |
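Since the CPU-only (AVX2) and CUDA runs above were taken on the same Ryzen 9 5950X box, the two tables give a direct measure of the GPU offload gain (a sketch; encoder times in ms copied from the tables, commit 22c96b4):

```python
# Encoder times (ms): CPU-only AVX2 run vs CUDA run on the RTX 2060,
# from the two tables above.
cpu_ms  = {"tiny": 195.29, "medium": 4333.87, "large-v2": 8056.16}
cuda_ms = {"tiny":  12.54, "medium":  200.69, "large-v2":  347.22}

# CUDA speedup over the 8-thread AVX2 CPU baseline, per model.
speedup = {m: cpu_ms[m] / cuda_ms[m] for m in cpu_ms}
for m, s in speedup.items():
    print(f"{m}: CUDA encodes {s:.1f}x faster than AVX2")
```

The gap widens with model size, from roughly 16x for `tiny` to over 23x for `large-v2`.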
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
RTX 2060 | AVX2 CUDA | tiny | 8 | 1 | 7.21 | 0.76 | 0.29 | 0.02 | 22c96b4 |
RTX 2060 | AVX2 CUDA | tiny-q5_0 | 8 | 1 | 7.42 | 0.82 | 0.18 | 0.02 | 22c96b4 |
RTX 2060 | AVX2 CUDA | tiny-q5_1 | 8 | 1 | 7.38 | 0.82 | 0.18 | 0.02 | 22c96b4 |
RTX 2060 | AVX2 CUDA | ... |
v1.5.5
Overview
Many small incremental updates + Token level timestamps with DTW by @denersc in #1485
Feedback is welcome!
Full Changelog: v1.5.4...v1.5.5
What's Changed
- server : fix server temperature + add temperature_inc by @ggerganov in #1729
- main : add cli option to disable system prints by @ggerganov in #1740
- server: add request path by @eschmidbauer in #1741
- Optional Piper TTS support for talk-llama example. by @RhinoDevel in #1749
- fix/1748 by @nank1ro in #1750
- Don't compute timestamps when not printing them. by @ghindle in #1755
- Add more parameters to server api by @ghindle in #1754
- Add SetInitialPrompt method to go bindings by @blib in #1753
- ggml : fix 32-bit ARM compat for IQ2_XS by @ggerganov in #1758
- refactor: get all scripts to be POSIX Compliant by @sonphantrung in #1725
- whisper : load the model into multiple buffers of max size 1GB by @ggerganov in #1763
- rebase against your -np changes (thx) and add better python file to be used on the command line or as library by @contractorwolf in #1744
- examples/talk-llama: Add optional commandline parameter to set the bot name. by @RhinoDevel in #1764
- server : fix building and simplify lib deps on Windows by @przemoc in #1772
- talk-llama: optional wake-up command and audio confirmation by @Rakksor in #1765
- examples/server: implement "verbose_json" format with token details by @rmmh in #1781
- whisper.android: Return output from benchmarks by @luciferous in #1785
- libwhisper.so should be position independent by @trixirt in #1792
- Docs: try to make model options / model install methods clearer by @mrienstra in #1806
- common : fix input buffer check by @ggerganov in #1812
- Update Makefile by @jwijffels in #1813
- Add fields to `verbose_json` response and show examples on the home page by @JacobLinCool in #1802
- common: fix wav buffer detection by @JacobLinCool in #1819
- Add macOS deployment target option to Makefile by @didzis in #1839
- Expose CUDA device setting in public API by @didzis in #1840
- whisper.android: How to build with CLBlast by @luciferous in #1809
- server: Allow CORS request with authorization headers by @valenting in #1850
- Embed Metal library source into compiled binary by @didzis in #1842
- added audio_ctx argument to main and server examples by @dscripka in #1857
- whisper : fix external encoder by @ggerganov in #1860
- swift : package no longer use ggml dependency by @ggerganov in #1861
- fix openvino setup docs by @jumpers775 in #1874
- clean up common code in examples by @felrock in #1871
- main : check if input files exist before proceeding by @Theldus in #1872
- Linking issue fix via Makefile when CUBLAS enabled in the WSL #1876 by @lbluep in #1878
- main : fix file existence check in main.cpp by @Theldus in #1889
- openvino : fix convert-whisper-to-openvino.py for v2023.0.0 (#1870) by @st-gr in #1890
- ggml : 32-bit arm compat by @ggerganov in #1891
- Add SYCL logic in whisper by @abhilash1910 in #1863
- talk and talk-llama: Pass text_to_speak as a file by @tamo in #1865
- Stream.wasm: Fix invalid memory access when no segments are returned by @Andrews54757 in #1902
- Update README to Recommend MacOS Sonoma for Core ML to avoid hallucination by @gavin1818 in #1917
- Add library versioning by @kenneth-ge in #1352
- Fix SF(segment fault) issue in Android JNI by @zhouwg in #1929
- Fix typo in source file whisper.cpp by @zhouwg in #1925
- bench:fix typo by @zhouwg in #1933
- Auto lowercase language parameter by @F1L1Pv2 in #1928
- ggml : try fix 32-bit arm compat by @ggerganov in #1938
- whisper : make beam candidate sort more stable by @josharian in #1943
- bindings/go : add linker flags to make metal work by @josharian in #1944
- whisper : improve beam search candidate diversity by @josharian in #1947
- whisper : document whisper_batch.n_seq_id by @josharian in #1942
- Rename --audio-context to --audio-ctx, as per help text by @joliss in #1953
- [DRAFT] Token level timestamps with DTW (#375) by @denersc in #1485
- Fedora dependencies needed (SDL2) by @Man2Dev in #1970
- libcuda.so.1 in PATH in Docker Container by @tiagofassoni in #1966
- ruby : fix build by @ggerganov in #1980
- Improve support for distil-large-v3 by @sanchit-gandhi in #1982
- whisper : improve handling of prompts by @ggerganov in #1981
- sync : ggml by @ggerganov in #2001
- Implemented command-style grammar in the main example. by @ulatekh in #1998
- Use pkg-config for OpenBLAS by @przemoc in #1778
- ci : add building in MSYS2 environments (Windows) by @przemoc in #1994
- Support CUDA versions < 11.1 by @primenko-v in #2020
- Create solution folders in the CMake build by @ulatekh in #2004
- Allow a regular expression to describe tokens to suppress by @ulatekh in #1997
- "main" example now allows a response-file as the sole parameter by @ulatekh in #2019
- Support for CPU BLAS build via Intel MKL by @slashlib in #2024
- Set stdin to binary mode on Windows. Fixes #2023 by @rotemdan in #2025
- Fix file-handle leak in read_wav() by @ulatekh in #2026
- Fix DTW memory access by @bradmurray-dt in #2012
- whisper: update grammar-parser.cpp by @eltociear in #2058
- fix missing reference to "model" variable in actual shell command run in whisper.nvim by @sixcircuit in #2049
- build : detect AVX512 in Makefile, add AVX512 option in CMake by @didzis in #2043
- feature/no timestamps node by @pprobst in #2048
- Update embedded Metal library generation process to include dependency by @didzis in #2045
- server.cpp: add dtw by @eschmidbauer in #2044
New Contributors
- @eschmidbauer made their first contribution in #1741
- @RhinoDevel made their first contribution in #1749
- @nank1ro made their first contribution in #1750
- @ghindle made their first contribution in #1755
- @blib made their first contribution in #1753
- @sonphantrung made their first contribution in #1725
- @contractorwolf made their first contribution in #1744
- @Rakksor made their first contribution in #1765
- @rmmh made their f...
v1.5.4
v1.5.3
Overview
Minor maintenance release:
- Fix CUDA issues where the transcription produces garbage
- Fix quantized models to work with the CUDA backend
- Allow using `whisper.cpp` and `llama.cpp` together in SwiftUI projects
What's Changed
- Update bench.py by @ForkedInTime in #1655
- cmake : Resolve quantized model issue when CUBLAS enabled by @bobqianic in #1667
- examples : Revert CMakeLists.txt for talk-llama by @bobqianic in #1669
- CI : Add coverage for talk-llama when WHISPER_CUBLAS=1 by @bobqianic in #1672
- ci: build and push docker image by @OpenWaygate in #1674
- sync : ggml (ggml_scale, ggml_row_size, etc.) by @ggerganov in #1677
- Replace `WHISPER_PRINT_DEBUG` with `WHISPER_LOG_DEBUG` by @bobqianic in #1681
- download: Fix large q5 model name by @dimopep in #1695
- sync : ggml (VMM, sync-ggml-am.sh, dotprod ARM fixes) by @ggerganov in #1691
- whisper : replace `tensor->n_dims` with `ggml_n_dims(tensor)` by @bobqianic in #1694
- Build with CLBlast by @tamo in #1576
- docker : Fix the Publishing of the CUDA Docker Image by @bobqianic in #1704
- emscripten: fix "Stack Overflow!" by @Huguet57 in #1713
- sync : ggml by @ggerganov in #1717
- Add error handling to graph_compute by @finnvoor in #1714
- Updates Package.swift to use ggml as package dependency by @1-ashraful-islam in #1701
New Contributors
- @ForkedInTime made their first contribution in #1655
- @OpenWaygate made their first contribution in #1674
- @dimopep made their first contribution in #1695
- @Huguet57 made their first contribution in #1713
- @1-ashraful-islam made their first contribution in #1701
Full Changelog: v1.5.2...v1.5.3
v1.5.2
Overview
Minor maintenance release:
- Re-enable CPU BLAS processing after fixing a regression (#1583)
- Add new example: wchess (demo video: wchess-0.mp4)
Shoutout to @fraxy-v (implementation) and @ejones (grammar) for making it work!
What's Changed
- automatically convert audio on the server by @sapoepsilon in #1539
- CI : Rectify the Clang-Related workflow issues by @bobqianic in #1551
- CI : Add CUDA 11.8.0 support by @bobqianic in #1554
- Update main program help info by @bebound in #1560
- Set default CORS headers to allow all by @kasumi-1 in #1567
- cmake : install required ggml.h header by @gjasny in #1568
- Backport .srt output format to examples/server by @osdrv in #1565
- Added support for .vtt format to Whisper server by @aleksanderandrzejewski in #1578
- ggml : re-enable blas for src0 != F32 by @ggerganov in #1583
- Fix 32-bit compiler warning by @Digipom in #1575
- Remove #if arch(arm) check in Swift Package Manager by @finnvoor in #1561
- Pass max-len argument to server wparams by @osdrv in #1574
- sync : ggml (new ops, new backend, etc) by @ggerganov in #1602
- Fix `ggml_metal_log` on Intel macs by @finnvoor in #1606
- Update CMakeLists.txt by @Kreijstal in #1615
- target windows 8 or above for prefetchVirtualMemory in llama-talk by @Kreijstal in #1617
- sync : ggml (Metal fixes, new ops, tests) by @ggerganov in #1633
- wchess: whisper assisted chess by @fraxy-v in #1595
New Contributors
- @sapoepsilon made their first contribution in #1539
- @bebound made their first contribution in #1560
- @kasumi-1 made their first contribution in #1567
- @gjasny made their first contribution in #1568
- @osdrv made their first contribution in #1565
- @aleksanderandrzejewski made their first contribution in #1578
- @Kreijstal made their first contribution in #1615
- @fraxy-v made their first contribution in #1595
Full Changelog: v1.5.1...v1.5.2
v1.5.1
Overview
Minor update:
- With Metal, automatically fall back to CPU if the device does not support the Apple7 GPU family
- Add server example
What's Changed
- ISSUE-1329: replace " with ' so it doesn't try to execute code in backticks by @spullara in #1364
- sync : ggml (ggml-alloc + linker + gguf fixes) by @ggerganov in #1501
- Fixed with_state methods, to use the correct state by @sandrohanea in #1519
- #1517 Redistribute CUDA DLLs by @tamo in #1522
- whisper : reuse whisper_decode_with_state by @ggerganov in #1521
- sdl : fix audio callback by @ggerganov in #1523
- update deprecated example by @MightyStud in #1529
- Super Simple Whisper Server by @felrock in #1380
- Close file after writing in server application by @felrock in #1533
- bench : multi-thread memcpy by @ggerganov in #1534
- Change temp file name for server application by @felrock in #1535
- Fixed Makefile for MacOS ARM 64 Go bindings by @gleicon in #1530
- Fixed metal build on macos-latest by @sandrohanea in #1544
- fix(server): typo in temperature parameter by @Okabintaro in #1545
- Request to add a new function to get the full language name by @bradmit in #1546
- server : add --print-realtime param by @ecneladis in #1541
- cuda : sync some minor stuff from llama.cpp by @ggerganov in #1548
- metal : add backend function to check device family support by @ggerganov in #1547
New Contributors
- @spullara made their first contribution in #1364
- @MightyStud made their first contribution in #1529
- @felrock made their first contribution in #1380
- @gleicon made their first contribution in #1530
- @Okabintaro made their first contribution in #1545
- @bradmit made their first contribution in #1546
- @ecneladis made their first contribution in #1541
Full Changelog: v1.5.0...v1.5.1