huggingface/optimum-intel v1.26.0 on GitHub

🏗️ New architectures support

Add OpenVINO support for gpt-oss by @echarlaix in #1428
Add OpenVINO support for MiniCPM-o by @rkazants in #1454
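
As a rough illustration of what the new architecture support enables, the sketch below loads a gpt-oss checkpoint through the OpenVINO backend. It assumes `optimum-intel` is installed with its OpenVINO extras and uses `openai/gpt-oss-20b`, the model ID named in the changelog; the helper name `load_gpt_oss_openvino` is hypothetical and not part of the library. The imports sit inside the function so that merely defining it downloads nothing.

```python
def load_gpt_oss_openvino(model_id: str = "openai/gpt-oss-20b"):
    """Hypothetical helper: export a gpt-oss checkpoint to OpenVINO and load it.

    Calling this downloads the checkpoint and converts it, which needs
    substantial disk space and memory.
    """
    # Imported lazily so that defining the helper has no side effects.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the original checkpoint to OpenVINO IR on the fly.
    model = OVModelForCausalLM.from_pretrained(model_id, export=True)
    return tokenizer, model
```

Once loaded, the returned model can be used with the usual `generate` call from transformers.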

🚀 New Features

Add feature-extraction and text-classification support for Qwen3 export by @openvino-dev-samples in #1415
VLM Vision Encoder full quantization by @nikita-savelyevv in #1394
Add support for transformers v4.54 and v4.55 by @echarlaix in #1406
New pipelines by @IlyasMoutawwakil in #1462
IPEX transformers upgrade to 4.55 by @kaixuanliu in #1467
Adopt new NNCF mxfp4 quantization logic by @nikita-savelyevv in #1465

🔧 Enhancements & Fixes

Fix for diffusers 0.35 (and fix and speedup documentation build with uv) by @IlyasMoutawwakil in #1426
Fix regexp error with searching onnx model by @sbalandi in #1442
Fix high memory consumption during vision encoder NNCF quantization by @nikita-savelyevv in #1440
Fix disk by distributing tests by @IlyasMoutawwakil in #1438
Fix CI rate limiting by @IlyasMoutawwakil in #1449
Add custom task inferring logic for mistralai/Mistral-7B-Instruct-v0.3 by @nikita-savelyevv in #1413
Fix editable mode and cleanup/adapt for optimum v2 by @IlyasMoutawwakil in #1457
Faster cli startup / helper by @IlyasMoutawwakil in #1460
Fix OpenVINO VLM in-place static quantization by @nikita-savelyevv in #1464
Remove datasets as required dependency by @echarlaix in #1466
Fix OpenVINO data-free pipeline quantization by @nikita-savelyevv in #1477
Skip concat_qkv creation for TP mode by @kaixuanliu in #1481
Introduce OPENVINO_TEST_DEVICE by @IlyasMoutawwakil in #1479
Add workaround logic for OpenVINO default int4 quantization of gpt-oss models by @nikita-savelyevv in #1490
Apply chat_template for MiniCPM-o-2_6 by @Wovchena in #1484

🧹 Deprecations

Deprecate Quantizer support of nn.Module by @echarlaix in #1421

New Contributors

@TharescdqTuA made their first contribution in #1436

What's Changed

* Fix for diffusers 0.35 (and fix and speedup documentation build with uv) by @IlyasMoutawwakil in https://github.com//pull/1426
* [OpenVINO]to support qwen3 embedding/rerank by @openvino-dev-samples in https://github.com//pull/1415
* Remove reference to docker based doc building in Makefile by @IlyasMoutawwakil in https://github.com//pull/1430
* VLM Vision Encoder full quantization by @nikita-savelyevv in https://github.com//pull/1394
* Fix main documentation build/push by @IlyasMoutawwakil in https://github.com//pull/1432
* [Docs] Export whisper as stateless during quantization by @nikita-savelyevv in https://github.com//pull/1435
* chore: Fix Misspellings by @TharescdqTuA in https://github.com//pull/1436
* Skip marian tests on OpenVINO 2025.3 by @nikita-savelyevv in https://github.com//pull/1437
* [OV] Remove local imports from quantization.py by @nikita-savelyevv in https://github.com//pull/1433
* Update references after OpenVINO 2025.3 release by @nikita-savelyevv in https://github.com//pull/1441
* Fix regexp error with searching onnx model by @sbalandi in https://github.com//pull/1442
* [OV] Fix high memory consumption during vision encoder quantization by @nikita-savelyevv in https://github.com//pull/1440
* Fix disk by distributing tests by @IlyasMoutawwakil in https://github.com//pull/1438
* chore: Fix Misspellings by @TharescdqTuA in https://github.com//pull/1443
* Benchmarks updates by @ezelanza in https://github.com//pull/1439
* Deprecate Quantizer support of nn.Module by @echarlaix in https://github.com//pull/1421
* Add support transformers v4.54 v4.55 by @echarlaix in https://github.com//pull/1406
* Fix CI rate limiting by @IlyasMoutawwakil in https://github.com//pull/1449
* Add OpenVINO support for gpt-oss by @echarlaix in https://github.com//pull/1428
* Remove legacy code needed for transformers < v4.45 by @echarlaix in https://github.com//pull/1451
* add missing is_transformers_version by @echarlaix in https://github.com//pull/1453
* Add custom task inferring logic for mistralai/Mistral-7B-Instruct-v0.3 by @nikita-savelyevv in https://github.com//pull/1413
* Fix editable mode and cleanup/adapt for optimum v2 by @IlyasMoutawwakil in https://github.com//pull/1457
* [OpenVINO] Support openbmb/MiniCPM-o-2_6 for image-text-to-text task by @rkazants in https://github.com//pull/1454
* Faster cli startup / helper by @IlyasMoutawwakil in https://github.com//pull/1460
* set MAX_TRANSFORMERS_VERSION as inclusive upper bound by @echarlaix in https://github.com//pull/1463
* [OV] Fix VLM in-place static quantization by @nikita-savelyevv in https://github.com//pull/1464
* New pipelines by @IlyasMoutawwakil in https://github.com//pull/1462
* Set optimum-onnx version in setup by @echarlaix in https://github.com//pull/1472
* IPEX transformers upgrade to 4.55 by @kaixuanliu in https://github.com//pull/1467
* CI secret with public token fallback by @IlyasMoutawwakil in https://github.com//pull/1473
* Notebook_updated with benchmark by @ezelanza in https://github.com//pull/1475
* Add note about supported transformers version to export docs by @helena-intel in https://github.com//pull/1476
* [OpenVINO] Fix data-free pipeline quantization by @nikita-savelyevv in https://github.com//pull/1477
* [OpenVINO] Adopt new mxfp4 quantization logic by @nikita-savelyevv in https://github.com//pull/1465
* Remove datasets as required dependency by @echarlaix in https://github.com//pull/1466
* [OpenVINO] Add model inference check to weight-only and pipeline quantization testing by @nikita-savelyevv in https://github.com//pull/1470
* Skip concat_qkv creation for TP mode by @kaixuanliu in https://github.com//pull/1481
* Update OpenVINO workflows to run on Python 3.10 by @nikita-savelyevv in https://github.com//pull/1483
* Introduce OPENVINO_TEST_DEVICE by @IlyasMoutawwakil in https://github.com//pull/1479
* Fix PR upload documentation by @echarlaix in https://github.com//pull/1487
* [OpenVINO][Docs] Update docs to mention how to apply default 4-bit config via Python API by @nikita-savelyevv in https://github.com//pull/1486
* [OpenVINO] Add transformers==4.52 constraint to vision_language_quantization notebook by @nikita-savelyevv in https://github.com//pull/1489
* [OpenVINO] Add workaround logic for default int4 quantization of openai/gpt-oss-20b model by @nikita-savelyevv in https://github.com//pull/1490
* `_GPTOSSQuantizationConfig` follow-up by @nikita-savelyevv in https://github.com//pull/1492
* Apply chat_template for MiniCPM-o-2_6 by @Wovchena in https://github.com//pull/1484
* Update inc workflow python version by @echarlaix in https://github.com//pull/1494

Compatible with transformers>=4.45,<4.56
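
To make the bound concrete, here is a minimal standard-library sketch that checks whether a version string falls inside the stated range (4.45 inclusive, 4.56 exclusive). The names `parse_release` and `is_supported` are hypothetical and not part of optimum-intel. It compares only the leading numeric components, so it is a simplification: pre-release suffixes such as `.dev0` or `rc1` are ignored rather than ordered as PEP 440 would order them.

```python
import re

# Bounds taken from the compatibility line above.
MIN_VERSION = (4, 45)  # inclusive lower bound
MAX_VERSION = (4, 56)  # exclusive upper bound


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric components, e.g. '4.55.0.dev0' -> (4, 55, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"not a recognizable version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_supported(version: str) -> bool:
    """True if the version lies in the range >=4.45,<4.56 (simplified check)."""
    release = parse_release(version)
    # Tuples compare element by element, so (4, 56, 0) is not below (4, 56).
    return MIN_VERSION <= release < MAX_VERSION
```

In practice the function could be fed `transformers.__version__`; for full PEP 440 semantics, a dedicated version-parsing library would be the more robust choice.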

Full Changelog: v1.25.2...v1.26.0
