Highlights
- Keras-to-Torch Export: New `export_torch` enables exporting Keras models to native PyTorch `nn.Module` format, along with LiteRT (TFLite) export support for the PyTorch backend.
- Sliding Window Attention: Added `sliding_window` parameter to `MultiHeadAttention` and `GroupedQueryAttention` for efficient long-context attention.
- Flash / Fused SDPA: Causal-only MHA/GQA now automatically dispatches to Flash Attention (cuDNN SDPA), and the manual attention path correctly applies causal masking.
- Multi-Optimizer Training: New `MultiOptimizer` supports assigning different optimizers to sub-networks.
- New Math Operations: Added `unique`, `pinv`, `matrix_rank`, `fabs`, `fmax`, `fmin`, `erfc`, `dsplit`, `percentile`, `nanpercentile`, `sobel_edges`, and `ssim` (structural similarity) to `keras.ops`.
- Security Hardening: Comprehensive hardening of model reloading against HDF5 exploits, tar/zip traversal attacks, and insecure deserialization.
New Features and Operations
Multi-Backend Operations
- New NumPy Operations: Added `unique`, `fabs`, `fmax`, `fmin`, `dsplit`, `erfc`, `percentile`, and `nanpercentile` in `keras.ops.numpy`.
- New Linear Algebra Operations: Added `pinv` (pseudo-inverse) and `matrix_rank` in `keras.ops.linalg`.
- New Image Operations: Added `sobel_edges` for edge detection and `ssim` (structural similarity) in `keras.ops.image`.
- Negative Axes in Transpose: `keras.ops.transpose` now supports negative axis values.
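The difference between `percentile` and `nanpercentile` is easiest to see on a small input. The following is a minimal pure-Python sketch, not the Keras implementation: it assumes the new ops follow NumPy's default linear-interpolation convention, which these notes do not state, and the helper functions are written here purely for illustration.

```python
import math


def percentile(values, q):
    """q-th percentile with linear interpolation; NaN if any input is NaN."""
    if any(math.isnan(v) for v in values):
        return math.nan
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted data.
    rank = (q / 100.0) * (len(ordered) - 1)
    lo, hi = math.floor(rank), math.ceil(rank)
    # Interpolate between the two neighbouring samples.
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def nanpercentile(values, q):
    """Same as percentile, but NaN entries are dropped before computing."""
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        return math.nan
    return percentile(kept, q)


data = [1.0, math.nan, 3.0]
print(percentile(data, 50))     # NaN propagates through the plain variant
print(nanpercentile(data, 50))  # the NaN-aware variant ignores the missing value
```

Under that assumption, the plain variant is the right choice when a NaN should surface as an error signal, and the NaN-aware variant when missing values are expected in the data.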
Layers and Attention
- Sliding Window Attention: `MultiHeadAttention` and `GroupedQueryAttention` layers support the `sliding_window` parameter for efficient long-sequence processing.
- Flash Attention Engagement: Causal-only attention in MHA/GQA now uses Flash SDPA for significant speedups.
- Fused Bidirectional LSTM/GRU: JAX backend now fuses Bidirectional LSTM into a single cuDNN call; fused bidirectional GRU added for Torch backend.
- CTC Beam Search Decoder: Added CTC beam search decoding for the Torch backend.
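To make the sliding-window idea concrete, here is a small pure-Python sketch of the attention mask it implies when combined with causal masking. It does not call Keras, and the window convention (each query sees itself plus the `window - 1` positions before it) is an assumption for illustration; the notes do not specify how the `sliding_window` parameter counts positions.

```python
def sliding_window_causal_mask(seq_len, window):
    """Return mask[i][j], True when query position i may attend to key j.

    A key is visible only if it is not in the future (j <= i) and lies
    within the last `window` positions (i - j < window).
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_causal_mask(seq_len=5, window=2)
for row in mask:
    # Render the mask as a grid: x = attend, . = masked out.
    print("".join("x" if allowed else "." for allowed in row))
```

Because each row has at most `window` visible keys, attention cost grows linearly with sequence length for a fixed window instead of quadratically.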
Training and Optimizers
- MultiOptimizer: Supports training sub-networks with different optimizers.
- SKLearn Classifier: Added `predict_proba` method to `SKLearnClassifier`.
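The idea behind assigning different optimizers to sub-networks can be sketched without any framework. The code below is a toy illustration only: it is not the `MultiOptimizer` API, whose constructor and arguments are not described in these notes, and the function names and the nested-dictionary layout are hypothetical. Plain SGD with a per-sub-network learning rate stands in for "a different optimizer".

```python
def sgd_step(params, grads, learning_rate):
    """One plain SGD update over a {variable_name: value} mapping."""
    return {
        name: value - learning_rate * grads[name]
        for name, value in params.items()
    }


def multi_optimizer_step(params, grads, learning_rates):
    """Update each sub-network with its own optimizer settings.

    params and grads map sub-network name -> {variable_name: value};
    learning_rates maps sub-network name -> learning rate.
    """
    return {
        subnet: sgd_step(params[subnet], grads[subnet], learning_rates[subnet])
        for subnet in params
    }


params = {"encoder": {"w": 1.0}, "head": {"w": 1.0}}
grads = {"encoder": {"w": 1.0}, "head": {"w": 1.0}}
# The encoder takes larger steps than the head in this toy setup.
updated = multi_optimizer_step(params, grads, {"encoder": 0.5, "head": 0.25})
print(updated)
```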
Export and Deployment
- Keras-to-Torch Export: Export Keras models to native PyTorch `nn.Module` via `model.export(..., format="torch")`.
- LiteRT (TFLite) Export for PyTorch: Added LiteRT export support for models using the PyTorch backend.
- LiteRT Compatibility Fix: Fixed LiteRT export for Keras 3 + TF 2.20 + Python 3.13.
- ONNX Export: Support for dict/list inputs in Torch ONNX export; documented static input signature requirement for LiteRT PyTorch export.
Distribution and Parallelism
- ModelParallel Improvements: Defined contiguous replica-group data shard ID convention; added distribution information (`num_processes`, `num_model_replicas`, `data_shard_id`).
- Initializer Distribution Layout: Initializers can now handle the distribution layout directly with JAX.
- TF Dataset Distribution: Refactored TF dataset distribution with centralized sharding routing; fixed data distribution for model training in JAX.
OpenVINO Backend Support
The OpenVINO backend received continued improvements:
- New Operations: Implemented `glu`, `sparsemax`, `gaussian_blur`, `logdet`, `cholesky`, `lu_factor`, `erfc`, `segment_min`, `segment_prod`, `percentile`, `nanmedian`, `nanpercentile`, `unique`, `flash_attn`, greedy `ctc_decode`, `solve_triangular`, `compute_homography_matrix`, and image transforms (affine, perspective, elastic).
- Opset Upgrades: Upgraded to opset16 for select operations and full upgrade.
- Fixes: Dynamic/symbolic shape handling, mask propagation, random seed determinism, dropout during predict, Lanczos interpolation in resize, dynamic batch shape propagation, and improved efficiency using native ops.
Security
- HDF5 Hardening: Reject `ExternalLink`/`SoftLink` groups, virtual datasets, and shape-bomb datasets in model loading.
- Archive Hardening: Reject tar members and links escaping the extraction directory, and ZIP/NPZ members declaring excessive data. Validate asset paths from Orbax checkpoints.
- Deserialization Safety: Disable pickle in `np.load`, fix insecure deserialization in dataset utilities, and make `Lambda`/`TorchModuleWrapper` `from_config` fail closed when `safe_mode` is unset.
- CI/Workflow: Fix prompt injection in issue triage workflow.
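The archive traversal checks mentioned above follow a well-known pattern: resolve where each member would land and reject it if that location falls outside the extraction directory. The sketch below shows the pattern with the standard library only. It is not the code Keras ships, the function name is hypothetical, and it covers member paths only, not link targets, which need the same treatment.

```python
import os


def is_within_directory(base_dir, member_name):
    """Return True if extracting member_name would stay inside base_dir."""
    base = os.path.realpath(base_dir)
    # Joining with an absolute member name discards base, so absolute
    # paths are caught by the same comparison as "../" sequences.
    target = os.path.realpath(os.path.join(base, member_name))
    return os.path.commonpath([base, target]) == base


for name in ["weights/model.h5", "../../etc/passwd", "/etc/passwd"]:
    verdict = "ok" if is_within_directory("/tmp/extract", name) else "rejected"
    print(f"{name}: {verdict}")
```

For tar archives specifically, recent Python versions also provide extraction filters (`tarfile.extractall(..., filter="data")`), which apply similar checks inside the standard library.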
Bug Fixes and Improvements
Backend Specific Improvements
- PyTorch: Fixed `convert_to_tensor` for Python scalars, `divide_no_nan()` NaN gradients, BiLSTM dispatch, `lstsq` with rcond, `SymInt`/`SymFloat` handling in `convert_to_tensor` and `slice`, and median for even-length inputs.
- JAX: Fused Bidirectional LSTM into cuDNN call.
- TensorFlow: Fixed depthwise/separable conv with stride and dilation. Optimized `tf.tensordot` by removing redundant float casts.
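The median fix for even-length inputs is worth a short illustration, because libraries disagree on the convention: NumPy averages the two middle values, while `torch.median` returns the lower of the two. The notes do not say exactly what was wrong, so the sketch below simply assumes the intended behavior is the NumPy convention; it is a reference implementation in plain Python, not the backend code.

```python
def median(values):
    """Median that averages the two middle values for even-length input."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd length: a single middle element exists.
        return float(ordered[mid])
    # Even length: average the two elements around the midpoint.
    return (ordered[mid - 1] + ordered[mid]) / 2.0


print(median([4, 1, 3, 2]))  # even-length input
print(median([3, 1, 2]))     # odd-length input
```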
Layers and Ops
- Mixed Precision Fix: Fixed float16 numerical instability in `GroupNormalization` with small epsilon; disabled autocast for mixed precision stability.
- Dense Layer OOM: Fixed GPU OOM with rank-3 input due to `BatchMatMulV2` gradient materialization.
- GroupQueryAttention: Fixed symbolic output shape with `return_attention_scores`.
- Conv Transpose: Save `output_padding` in `Conv1D/2D/3DTranspose` `get_config`.
- Attention Layer: Fixed stale `return_attention_scores` flag in `compute_output_spec`; save seed in `get_config`.
- Ops Validation: Added comprehensive axis validation in `softmax`, `normalize`, `swapaxes`, `moveaxis`, `sort`, `argsort`, `cumsum`, `cumprod`, `take`, `stack`, `concatenate`, `split`, `diff`, `transpose`, and more.
- EinsumDense: Fixed `compute_output_shape` to work before build.
- Discretization: Fixed bin boundaries calculation.
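Axis validation of the kind listed above usually means two things: rejecting out-of-range axes early with a clear error, and mapping negative axes to their non-negative equivalents. The following is a minimal sketch of that pattern in plain Python. The helper names are hypothetical and the exact checks and error messages Keras uses may differ.

```python
def canonicalize_axis(axis, ndim):
    """Map an axis in [-ndim, ndim) to its non-negative equivalent."""
    if not isinstance(axis, int):
        raise TypeError(f"axis must be an int, got {type(axis).__name__}")
    if not -ndim <= axis < ndim:
        raise ValueError(
            f"axis {axis} is out of bounds for a tensor of rank {ndim}"
        )
    return axis + ndim if axis < 0 else axis


def canonicalize_axes(axes, ndim):
    """Canonicalize a sequence of axes and reject duplicates."""
    result = tuple(canonicalize_axis(a, ndim) for a in axes)
    if len(set(result)) != len(result):
        raise ValueError(f"repeated axis in {tuple(axes)}")
    return result


# A transpose permutation written with negative axes on a rank-3 tensor.
print(canonicalize_axes((0, -1, -2), ndim=3))
```

Normalizing once at the API boundary means every backend receives the same non-negative axes, which is one plausible way to get consistent behavior across backends.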
Model Saving and Loading
- Nested Sublayers: Fixed save/load for custom models/layers with sublayers in nested lists.
- Orbax: Fixed bug from Orbax's recent rename from "pytree" to "state".
- Sequential: Improved error handling for missing keys during deserialization.
- Pipeline: Validated `from_config` layers and avoided mutating the input config.
Other Improvements
- Callbacks: Fixed `EarlyStopping`/`ReduceLROnPlateau` resetting `self.best` between fit calls; fixed `TensorBoard` callback step counter never updating.
- Progress Bar: Removed double averaging of metrics.
- LoRA Weights: Use float32 to avoid underflow/overflow risk.
- Tree Utilities: Optimized `tree.flatten` and `tree.map_structure` for common cases.
- Depthwise/Separable Conv: Removed backend-specific strides + dilation_rate restriction; validated output shapes in build; transposed channels_first to NHWC on CPU.
- Regularizers: Allow plain callables as regularizers; fixed `L1L2` regularizer.
- Added AI Contribution Policy.
- Added `CITATION.cff` for repository citation.
New Contributors
We would like to thank our new contributors for making their first contribution to the Keras project:
- @ssam18 made their first contribution in #22617
- @chir4gm made their first contribution in #22635
- @MalyalaKarthik66 made their first contribution in #22179
- @satishkc7 made their first contribution in #22641
- @Saumay made their first contribution in #22718
- @shashaka made their first contribution in #22679
- @AdonaiVera made their first contribution in #22739
- @gaga1313 made their first contribution in #22538
- @bzantium made their first contribution in #22740
- @ShaunakDas88 made their first contribution in #22728
- @mgomes0 made their first contribution in #22689
- @pctablet505 made their first contribution in #22797
- @sharesth23 made their first contribution in #22781
- @othakkar made their first contribution in #22847
- @rahulrathnavel made their first contribution in #22835
- @codewithyug06 made their first contribution in #22444
- @JyotinderSingh made their first contribution in #22362
- @divakaivan made their first contribution in #21556
- @buildwithsuhana made their first contribution in #22903
- @LinZiyuu made their first contribution in #22899
- @jeffcarp made their first contribution in #22961
- @2HParaa made their first contribution in #22974
- @dvadym made their first contribution in #23003
- @Lawson-Darrow made their first contribution in #23004
- @ul611 made their first contribution in #22892
- @rni418 made their first contribution in #23026
- @peinguim made their first contribution in #23057
Full Changelog: v3.14.0...v3.15.0