OpenVINO
- Add quantization support for the Whisper pipeline by @nikita-savelyevv in #1040 (see the quantization sketch after this list)
- Add Qwen2-VL support by @eaidova in #1042
- Add AWQ models support by @mvafin in #1049
- Update default OV configuration by @KodiaqQ in #1057
- Introduce `--quant-mode` CLI argument enabling full quantization via optimum-cli by @nikita-savelyevv in #1061
- Merge decoder and decoder-with-past into a single stateful model for seq2seq models by @eaidova in #1078
- Add transformers 4.47 support by @IlyasMoutawwakil in #1088
- Add GLM-Edge models support by @eaidova in #1089
- Add Granite and GraniteMoe models support by @eaidova in #1099
- Add fp8 implementation by @KodiaqQ in #1100
- Add Flux Fill inpainting pipeline support by @eaidova in #1095
- Add Sana support by @eaidova in #1106
- Add v4.48 transformers support by @IlyasMoutawwakil in #1136
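As an illustration of the new full-quantization path (#1040, #1061) from the Python API rather than the CLI, here is a minimal sketch for a Whisper pipeline. The checkpoint name, calibration dataset, and the exact `OVQuantizationConfig` arguments are assumptions for illustration and may differ from the options exposed in this release.

```python
from optimum.intel import OVModelForSpeechSeq2Seq, OVQuantizationConfig

# Assumed example checkpoint; any Whisper model from the Hub can be used.
model_id = "openai/whisper-tiny"

# Full (weight + activation) quantization; the calibration dataset name and
# sample count below are assumed values for illustration.
quantization_config = OVQuantizationConfig(dataset="librispeech", num_samples=32)

model = OVModelForSpeechSeq2Seq.from_pretrained(
    model_id,
    export=True,  # convert the PyTorch checkpoint to OpenVINO IR on the fly
    quantization_config=quantization_config,
)
model.save_pretrained("whisper-tiny-ov-int8")
```

The equivalent command-line flow goes through `optimum-cli export openvino` with the new `--quant-mode` argument.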
IPEX
- Add support for Sentence Transformers models by @echarlaix in #1034 (see the embedding sketch after this list)
```python
from optimum.intel import IPEXSentenceTransformer

model = IPEXSentenceTransformer.from_pretrained(model_id)
```
- Add support for the text-to-text generation task by @jiqing-feng in #1054 (see the generation sketch after this list)
```python
from optimum.intel import IPEXModelForSeq2SeqLM

model = IPEXModelForSeq2SeqLM.from_pretrained(model_id)
```
- Enable Flash Attention by @jiqing-feng in #1065
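As a usage sketch for the Sentence Transformers wrapper above: the standard `encode` call applies, and the checkpoint name below is only an example.

```python
from optimum.intel import IPEXSentenceTransformer

# Example checkpoint; any sentence-transformers model from the Hub should work.
model = IPEXSentenceTransformer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

embeddings = model.encode(["Hello world", "optimum-intel release notes"])
print(embeddings.shape)  # (2, embedding_dim)
```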
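And a minimal end-to-end generation sketch for the text-to-text task; the T5 checkpoint is an assumed example, and the tokenizer and generate calls are the standard transformers API.

```python
from transformers import AutoTokenizer
from optimum.intel import IPEXModelForSeq2SeqLM

# Assumed example checkpoint; any supported encoder-decoder model can be used.
model_id = "google/flan-t5-small"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = IPEXModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Translate English to German: How are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```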
Neural Compressor
- Add INC layerwise quantization support by @changwangss in #1018
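The layer-wise option plugs into the existing `INCQuantizer` post-training flow. The sketch below shows that generic flow with a dynamic-quantization config; the checkpoint name is an assumption, and the layer-wise switch itself is exposed through the Neural Compressor configuration rather than shown here.

```python
from transformers import AutoModelForSequenceClassification
from neural_compressor.config import PostTrainingQuantConfig
from optimum.intel import INCQuantizer

# Assumed example checkpoint.
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Generic post-training (dynamic) quantization config; layer-wise behaviour is
# controlled through the Neural Compressor config rather than this sketch.
quantization_config = PostTrainingQuantConfig(approach="dynamic")

quantizer = INCQuantizer.from_pretrained(model)
quantizer.quantize(quantization_config=quantization_config, save_directory="inc_quantized_model")
```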
Full Changelog: https://github.com/huggingface/optimum-intel/commits/v1.22.0