OpenVINO
- Add OpenVINO export CLI by @echarlaix in #437
optimum-cli export openvino --model gpt2 ov_model
New architectures
LCMs
- Enable Latent Consistency models OpenVINO export and inference by @echarlaix in #463
from optimum.intel import OVLatentConsistencyModelPipeline
pipe = OVLatentConsistencyModelPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
images = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=8.0).images
Pix2Struct
GPTBigCode
- Add support for export and inference for GPTBigCode models by @echarlaix in #459
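As a rough sketch of what GPTBigCode support enables, the helper below exports a checkpoint to OpenVINO while loading it and generates a short completion. The helper name `complete_code`, the default checkpoint `bigcode/gpt_bigcode-santacoder`, and the generation settings are assumptions chosen for illustration and are not taken from the PR.

```python
def complete_code(prompt, model_id="bigcode/gpt_bigcode-santacoder", max_new_tokens=32):
    """Export a GPTBigCode checkpoint to OpenVINO and generate a completion.

    Hypothetical helper; the default model_id is an assumed example checkpoint.
    """
    # Imported lazily so that defining the helper does not require the packages.
    from transformers import AutoTokenizer
    from optimum.intel import OVModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the original checkpoint to OpenVINO IR when loading.
    model = OVModelForCausalLM.from_pretrained(model_id, export=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete_code("def fibonacci(n):")` would download the checkpoint, convert it, and return the decoded text, so the first call may take a while.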
Changes and bugfixes
- Move VAE execution to fp32 precision on GPU by @eaidova in #432
- Enable OpenVINO export without ONNX export step by @eaidova in #397
- Enable 8-bit weight compression for OpenVINO model by @l-bat in #415
- Add image reshaping for statically reshaped OpenVINO SD models by @echarlaix in #428
- OpenVINO device updates by @helena-intel in #434
- Fix decoder model without cache by @echarlaix in #438
- Fix export by @echarlaix in #439
- Add 8-bit weight compression by default for decoders larger than 1B parameters by @AlexKoff88 in #444
- Add fp16 and int8 conversion to OVModels and export CLI by @echarlaix in #443
from optimum.intel import OVModelForCausalLM
model = OVModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)
- Create default attention mask when needed but not provided by @eaidova in #457
- Do not automatically cache models when exporting a model in a temporary directory by @helena-intel in #462
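To give an intuition for the 8-bit weight compression entries in the list above, here is a minimal, self-contained sketch of symmetric int8 quantization applied to one row of weights. It is an illustration under simplifying assumptions, not the algorithm used by OpenVINO or NNCF; the function names are hypothetical.

```python
# Illustrative sketch only: symmetric per-row int8 weight quantization.
# This is NOT the OpenVINO/NNCF implementation; all names are hypothetical.

def quantize_row_int8(row):
    """Map a list of floats to integers in [-127, 127] plus one float scale."""
    max_abs = max(abs(w) for w in row)
    # Guard against an all-zero row, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in row]
    return quantized, scale


def dequantize_row(quantized, scale):
    """Recover approximate float weights from the int8 values and the scale."""
    return [v * scale for v in quantized]


weights = [0.12, -0.5, 0.33, 0.0, 0.49]
q, scale = quantize_row_int8(weights)
restored = dequantize_row(q, scale)

# Rounding to the nearest step bounds the error by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored as one byte instead of two (fp16) or four (fp32), at the cost of a reconstruction error of at most half a quantization step per weight in this simplified scheme.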
Neural Compressor
- Integrate INC weight-only quantization by @mengniwang95 in #417
- Support num_key_value_heads by @jiqing-feng in #447
- Enable ORT model support to INC quantizer by @echarlaix in #436
- Fix INC model loading by @echarlaix in #452
- Fix INC modeling by @echarlaix in #453
- Add StarCoder past-kv shape for TSModelForCausalLM class by @changwangss in #371
- Fix transformers v4.35.0 compatibility by @echarlaix in #471
- Fix compatibility for optimum next release by @echarlaix in #460
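The `num_key_value_heads` entry above concerns models that use fewer key/value heads than attention heads (grouped-query attention), which changes the shape of the cached key/value tensors. The sketch below shows the usual shape arithmetic under that assumption; `past_kv_shape` is a hypothetical helper, not a function from the library.

```python
# Illustrative sketch only: shape of one cached key (or value) tensor.
# `past_kv_shape` is a hypothetical helper, not part of optimum-intel.

def past_kv_shape(batch_size, past_len, hidden_size,
                  num_attention_heads, num_key_value_heads=None):
    """Return (batch, kv_heads, past_len, head_dim) for a past key/value tensor."""
    # Models without grouped-query attention use one KV head per attention head.
    if num_key_value_heads is None:
        num_key_value_heads = num_attention_heads
    # The per-head dimension is always derived from the attention head count.
    head_dim = hidden_size // num_attention_heads
    return (batch_size, num_key_value_heads, past_len, head_dim)


# Standard multi-head attention: 32 attention heads, 32 KV heads.
mha_shape = past_kv_shape(1, 16, 4096, 32)

# Grouped-query attention: 32 attention heads share 8 KV heads.
gqa_shape = past_kv_shape(1, 16, 4096, 32, num_key_value_heads=8)
```

With these example numbers the cache for the grouped-query model is four times smaller along the head axis, which is why dummy past key/value inputs need to be built from `num_key_value_heads` rather than the attention head count.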
Full Changelog: https://github.com/huggingface/optimum-intel/commits/v1.12.0