🚀 New Features & Enhancements
Optimum 1.26 compatibility by @IlyasMoutawwakil in #1352
OpenVINO
- Introduce default full quantization configs for clip models by @nikita-savelyevv in #1302
- Introduce OVPipelineQuantizationConfig by @nikita-savelyevv in #1310
- Add int8 PTQ configs for some fill-mask models by @nikita-savelyevv in #1331
- Add transformers v4.52 compatibility by @eaidova in #1319
- Add compression config for Qwen/Qwen2.5-Coder-3B-Instruct by @MaximProshin in #1355
- Add support for data-free AWQ by @nikita-savelyevv in #1349
- Convert dataclasses to dicts in quantization config before saving by @nikita-savelyevv in #1362
- Remove reshaping for stateful decoders by @echarlaix in #1333
IPEX
- Add transformers v4.52 compatibility by @jiqing-feng in #1317
🔧 Key Fixes & Optimizations
- Raise if converted subcomponent not found by @echarlaix in #1303
- Keep Hybrid Quantization only for diffusion pipelines by @nikita-savelyevv in #1313
- Fix whisper with auto language detection by @eaidova in #1314
- Fix vision embeddings export for maira by @eaidova in #1320
- Fix VLM calibration dataset collection by @nikita-savelyevv in #1321
- Resize large images during VLM calibration data collection by @nikita-savelyevv in #1322
- Resolve logger warnings by @emmanuel-ferdman in #1324
- Fix progress bar during calibration dataset collection by @nikita-savelyevv in #1323
- Fix ESM models export and add them to the supported models by @eaidova in #1328
- Allow skipping the trace check for sentence transformers by @eaidova in #1332
- Fix int value recompile by @jiqing-feng in #1335
- Fix TP tensor dimension mismatch for IPEX models by @kaixuanliu in #1340
- Update Qwen3-8B compression config by @MaximProshin in #1341
New Contributors
- @kilavvy made their first contribution in #1345
- @maximevtush made their first contribution in #1347
- @leopardracer made their first contribution in #1351
What's Changed
- Dev version by @echarlaix in #1309
- Update number of int8 nodes for Segment Anything model by @nikita-savelyevv in #1311
- [OV][Docs] Keep Hybrid Quantization only for diffusion pipelines by @nikita-savelyevv in #1313
- raise if converted subcomponent not found by @echarlaix in #1303
- [OV] Introduce default full quantization configs for clip models by @nikita-savelyevv in #1302
- fix whisper with auto language detection by @eaidova in #1314
- fix vision embeddings export for maira by @eaidova in #1320
- [OV] Fix VLM calibration dataset collection by @nikita-savelyevv in #1321
- [OV] Resize large images during VLM calibration data collection by @nikita-savelyevv in #1322
- Resolve logger warnings by @emmanuel-ferdman in #1324
- [OV] Fix progress bar during calibration dataset collection by @nikita-savelyevv in #1323
- Limit INC version to fix CI by @changwangss in #1325
- [OV] Update AWQ test to pass on NNCF develop by @nikita-savelyevv in #1326
- Fix ESM models export and add it to supported by @eaidova in #1328
- Introduce OVPipelineQuantizationConfig by @nikita-savelyevv in #1310
- [OV] Add int8 PTQ configs for some fill-mask models by @nikita-savelyevv in #1331
- allow skip trace check for sentence transformers by @eaidova in #1332
- fix int value recompile by @jiqing-feng in #1335
- Add style bot by @echarlaix in #1337
- Fix setup.py to support INC latest version 3.4.1 by @changwangss in #1339
- fix bug when using tp, tensor dimension mismatch by @kaixuanliu in #1340
- fix optimum version by @echarlaix in #1344
- Updated Qwen3-8b compression config by @MaximProshin in #1341
- Fix Typo in Error Message for Sequence Length Validation by @kilavvy in #1345
- Fix Typographical Errors in Documentation String by @maximevtush in #1347
- upgrade windows runner image by @echarlaix in #1350
- Upgrade transformers version to 4.52 for ipex patching by @jiqing-feng in #1317
- Minor Typo Fixes in Comments for Quantized Generation Demo Notebook by @leopardracer in #1351
- fix openvino for compatibility with transformers 4.52 by @eaidova in #1319
- Optimum 1.26 compatibility by @IlyasMoutawwakil in #1352
- [OV] Update reference number of fp8 fake convert nodes by @nikita-savelyevv in #1348
- Compression config for Qwen/Qwen2.5-Coder-3B-Instruct by @MaximProshin in #1355
- Docs: Fix typos in quantized generation demo notebook by @kilavvy in #1356
- update style bot permission and token by @echarlaix in #1357
- [OV] Add support for data-free AWQ by @nikita-savelyevv in #1349
- Add documentation workflow by @echarlaix in #1361
- Fix style by @echarlaix in #1363
- fix by @echarlaix in #1364
- Fix documentation workflow by @echarlaix in #1365
- Convert dataclasses to dicts in quantization config before saving by @nikita-savelyevv in #1362
- Remove reshaping for stateful decoders by @echarlaix in #1333
Full Changelog: https://github.com/huggingface/optimum-intel/commits/v1.24.0