Major Features and Improvements
Intel® Extension for TensorFlow* extends the official TensorFlow capabilities, allowing TensorFlow workloads to run on Intel® Data Center GPU Max Series, Intel® Data Center GPU Flex Series, and Intel® Xeon® Scalable Processors. This release includes the following major features and improvements:
- Updated Support: Intel® Extension for TensorFlow* has been upgraded to support TensorFlow 2.15, the version released by Google and required for this release.
- Toolkit Support: Supports Intel® oneAPI Base Toolkit 2024.1.
- NextPluggableDevice Integration: Integrates NextPluggableDevice (an advanced generation of the PluggableDevice mechanism) as a new device type to enable seamless integration of new accelerator plugins. For more details, see the NextPluggableDevice Overview.
- Experimental OpenXLA Support: Provides experimental support for the Intel GPU backend for OpenXLA, enabling the OpenXLA GPU backend in Intel® Extension for TensorFlow* via a PJRT plugin. For more details, see the OpenXLA document.
- Compiler Enablement: Enables the Clang compiler to build Intel® Extension for TensorFlow* CPU wheels starting with this release; the currently supported version is LLVM/Clang 17. The official wheels published on PyPI are built with Clang, but users can choose to build wheels with the GCC compiler by following the steps in the Configure For CPU guide.
- Performance Optimization: Enables weight pre-pack support for Intel® Extension for TensorFlow* CPU to improve performance and reduce the memory footprint of `_ITEXMatMul` and `_ITEXFusedMatMul`. For more details, see the Weight Pre-Pack document.
- Package Redefinition: Re-defines the XPU package to support the GPU backend only, starting with this release. The official XPU wheels published on PyPI will support only the GPU backend, and the GPU wheels will be deprecated.
- New Operations: Supports new OPs to cover the majority of TensorFlow 2.15 OPs.
- Experimental Support: Continues to provide experimental support for Intel® Arc™ A-Series GPUs on Windows Subsystem for Linux 2 (WSL2) with Ubuntu Linux installed, and on native Ubuntu Linux.
Known Issues
- TensorList limitation: TensorList is not supported with NextPluggableDevice in TensorFlow 2.15.
- Allocation limitation on WSL2: The Windows Subsystem for Linux 2 (WSL2) caps the maximum size of a single allocation on a single device, which may cause an Out-of-Memory error. Users can remove this limitation with the environment variable `UR_L0_ENABLE_RELAXED_ALLOCATION_LIMITS=1`.
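  As a minimal sketch, the variable can be exported in the shell session before launching a workload on WSL2 (the workload command itself is omitted here):

  ```shell
  # Relax the Level Zero single-allocation size limit on WSL2.
  # Export this in the same shell session before starting the workload.
  export UR_L0_ENABLE_RELAXED_ALLOCATION_LIMITS=1
  ```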
- FP64 support: FP64 is not natively supported by the Intel® Data Center GPU Flex Series platform. If you run an AI workload with an FP64 kernel on that platform, the workload will exit with an exception such as `'XXX' Op uses fp64 data type, while fp64 instructions are not supported on the platform.`
- GLIBC++ mismatch: A `GLIBC++` version mismatch may cause a workload to exit with the exception `Can not find any devices. To check runtime environment on your host, please run itex/tools/python/env_check.py.` Try running the `env_check.py` script to confirm your runtime environment.
Other Information
- Performance Data: Provides a Performance Data document demonstrating the training and inference performance, as well as accuracy results, of several popular AI workloads with Intel® Extension for TensorFlow* benchmarked on Intel GPUs.