- Fix model patching for internlm2 by @eaidova in #814
- Fix loading models from cache by @eaidova in #820
- Disable tpp for un-verified models by @jiqing-feng in #822
- Update default NNCF configurations by @KodiaqQ in #824
- Fix update causal mask for transformers 4.42 by @eaidova in #852
- Fix bf16 inference accuracy for mistral, phi3, dbrx by @eaidova in #833
- Revert rotary embedding patching for recovering gpu accuracy by @eaidova in #855
- Support transformers 4.43 by @IlyasMoutawwakil in #856

Full Changelog: v1.18.1...v1.18.2