huggingface/optimum-intel v1.12.0: Weight-only quantization, LCM, Pix2Struct, GPTBigCode

OpenVINO

New architectures

LCMs

Add OpenVINO export CLI by @echarlaix in #437

optimum-cli export openvino --model gpt2 ov_model
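The same export can likely be driven from Python. The following is a minimal sketch, assuming the `export=True` argument used in the LCM example of these notes works the same way for `OVModelForCausalLM`. The helper name `export_to_openvino` is hypothetical and is not part of the library.

```python
def export_to_openvino(model_id: str, output_dir: str) -> None:
    """Hypothetical helper that mirrors `optimum-cli export openvino`."""
    # Imported lazily so this module loads even without optimum-intel installed.
    from optimum.intel import OVModelForCausalLM

    # export=True converts the checkpoint to the OpenVINO format while loading.
    model = OVModelForCausalLM.from_pretrained(model_id, export=True)
    # Write the converted model files to the target directory.
    model.save_pretrained(output_dir)
```

Calling `export_to_openvino("gpt2", "ov_model")` should produce a directory comparable to the CLI output. The CLI may also write tokenizer files, which this sketch does not handle.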

Enable OpenVINO export and inference for Latent Consistency Models by @echarlaix in #463

from optimum.intel import OVLatentConsistencyModelPipeline

pipe = OVLatentConsistencyModelPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
images = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=8.0).images

Pix2Struct

Add support for export and inference for Pix2Struct models by @eaidova in #450

GPTBigCode

Add support for export and inference for GPTBigCode models by @echarlaix in #459

Changes and bugfixes

Move VAE execution to fp32 precision on GPU by @eaidova in #432
Enable OpenVINO export without ONNX export step by @eaidova in #397
Enable 8-bit weight compression for OpenVINO model by @l-bat in #415
Add image reshaping for statically reshaped OpenVINO SD models by @echarlaix in #428
OpenVINO device updates by @helena-intel in #434
Fix decoder model without cache by @echarlaix in #438
Fix export by @echarlaix in #439
Add 8-bit weight compression by default for decoders larger than 1B parameters by @AlexKoff88 in #444
Add fp16 and int8 conversion to OVModels and export CLI by @echarlaix in #443

from optimum.intel import OVModelForCausalLM

model = OVModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)
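To give a feel for what the `load_in_8bit=True` option above trades off, here is a library-free sketch of 8-bit weight compression. It assumes symmetric per-row quantization, which is an illustrative choice only; the release notes do not describe the actual scheme, and the function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-row int8 quantization of a 2-D list of floats."""
    quantized, scales = [], []
    for row in weights:
        max_abs = max(abs(w) for w in row)
        # Map the largest magnitude in the row to 127; guard against all-zero rows.
        scale = max_abs / 127 if max_abs > 0 else 1.0
        quantized.append([max(-127, min(127, round(w / scale))) for w in row])
        scales.append(scale)
    return quantized, scales


def dequantize(quantized, scales):
    """Recover approximate float weights from int8 values and per-row scales."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]


weights = [[0.5, -1.0, 0.25], [2.0, 0.1, -0.4]]
q, scales = quantize_int8(weights)
restored = dequantize(q, scales)

# Largest absolute difference between original and restored weights.
max_error = max(
    abs(a - b)
    for row_a, row_b in zip(weights, restored)
    for a, b in zip(row_a, row_b)
)
```

Each weight is stored as one signed byte plus a shared per-row scale, so the rounding error per element stays within about half a scale step.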

Create default attention mask when needed but not provided by @eaidova in #457
Do not automatically cache models when exporting a model in a temporary directory by @helena-intel in #462
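The default attention mask entry above can be pictured with a small sketch. It assumes the default is an all-ones mask with the same shape as `input_ids`, which is the common convention but is not stated in these notes; the function name is hypothetical.

```python
def default_attention_mask(input_ids, attention_mask=None):
    """Return the given mask, or an all-ones mask shaped like input_ids."""
    if attention_mask is not None:
        return attention_mask
    # Without padding information, assume every token should be attended to.
    return [[1] * len(sequence) for sequence in input_ids]


batch = [[101, 2023, 102], [101, 2003, 102]]
mask = default_attention_mask(batch)
```

An all-ones mask is only correct when the batch contains no padding, so callers that pad their inputs should still pass an explicit mask.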

Neural Compressor

Integrate INC weight-only quantization by @mengniwang95 in #417
Support num_key_value_heads by @jiqing-feng in #447
Enable ORT model support to INC quantizer by @echarlaix in #436
Fix INC model loading by @echarlaix in #452
Fix INC modeling by @echarlaix in #453
Add StarCoder past-kv shape for TSModelForCausalLM class by @changwangss in #371
Fix transformers v4.35.0 compatibility by @echarlaix in #471
Fix compatibility for optimum next release by @echarlaix in #460
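The `num_key_value_heads` and StarCoder past-kv entries above both concern the shape of cached key/value tensors. The sketch below illustrates why the shapes differ, assuming the conventional layouts: separate key and value tensors of shape (batch, kv_heads, seq_len, head_dim) for grouped-query attention, and a single fused tensor of shape (batch, seq_len, 2 * head_dim) for multi-query models in the GPTBigCode style. The function and its parameters are hypothetical and do not reproduce the library's code.

```python
def past_kv_shapes(batch_size, seq_len, hidden_size, num_attention_heads,
                   num_key_value_heads=None, multi_query=False):
    """Return the per-layer cache tensor shapes under the assumed layouts."""
    head_dim = hidden_size // num_attention_heads
    if multi_query:
        # One fused key/value tensor shared by all heads (assumed layout).
        return [(batch_size, seq_len, 2 * head_dim)]
    # Fall back to one key/value head per attention head when unspecified.
    kv_heads = num_key_value_heads or num_attention_heads
    shape = (batch_size, kv_heads, seq_len, head_dim)
    return [shape, shape]  # key, value


gqa = past_kv_shapes(1, 8, 4096, 32, num_key_value_heads=8)
mqa = past_kv_shapes(1, 8, 6144, 48, multi_query=True)
```

Getting these shapes right matters when a model is traced with dummy cache inputs, since a mismatch between the dummy shape and the real cache would break inference with past key values.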

Full Changelog: https://github.com/huggingface/optimum-intel/commits/v1.12.0
