keras-team/keras v3.15.0 on GitHub

Highlights

Keras-to-Torch Export: New export_torch enables exporting Keras models to native PyTorch nn.Module format, along with LiteRT (TFLite) export support for the PyTorch backend.
Sliding Window Attention: Added sliding_window parameter to MultiHeadAttention and GroupedQueryAttention for efficient long-context attention.
Flash / Fused SDPA: Causal-only MHA/GQA now automatically dispatches to Flash Attention (cuDNN SDPA), and the manual attention path correctly applies causal masking.
Multi-Optimizer Training: New MultiOptimizer supports assigning different optimizers to sub-networks.
New Math Operations: Added unique, pinv, matrix_rank, fabs, fmax, fmin, erfc, dsplit, percentile, nanpercentile, sobel_edges, and ssim (structural similarity) to keras.ops.
Security Hardening: Comprehensive hardening of model reloading against HDF5 exploits, tar/zip traversal attacks, and insecure deserialization.

New Features and Operations

Multi-Backend Operations

New NumPy Operations: Added unique, fabs, fmax, fmin, dsplit, erfc, percentile, nanpercentile in keras.ops.numpy.
New Linear Algebra Operations: Added pinv (pseudo-inverse) and matrix_rank in keras.ops.linalg.
New Image Operations: Added sobel_edges for edge detection and ssim (structural similarity) in keras.ops.image.
Negative Axes in Transpose: keras.ops.transpose now supports negative axis values.
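
To illustrate what negative axis support means for a transpose, here is a minimal pure-Python sketch that maps negative axes to their non-negative equivalents and computes the resulting shape. The helper names `normalize_axes` and `transposed_shape` are hypothetical and are not part of keras.ops; the actual implementation may validate axes differently.

```python
def normalize_axes(axes, ndim):
    """Map possibly negative axes to non-negative ones and validate them."""
    normalized = []
    for axis in axes:
        if not -ndim <= axis < ndim:
            raise ValueError(f"axis {axis} is out of bounds for rank {ndim}")
        # -1 refers to the last axis, -2 to the one before it, and so on.
        normalized.append(axis % ndim)
    if sorted(normalized) != list(range(ndim)):
        raise ValueError(f"axes {tuple(axes)} is not a permutation of range({ndim})")
    return tuple(normalized)


def transposed_shape(shape, axes):
    """Shape obtained by permuting `shape` according to `axes`."""
    axes = normalize_axes(axes, len(shape))
    return tuple(shape[a] for a in axes)


# Swap the last two axes of a rank-3 shape using negative indices.
print(transposed_shape((2, 3, 4), (0, -1, -2)))  # (2, 4, 3)
```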

Layers and Attention

Sliding Window Attention: MultiHeadAttention and GroupedQueryAttention layers support the sliding_window parameter for efficient long-sequence processing.
Flash Attention Engagement: Causal-only attention in MHA/GQA now uses Flash SDPA for significant speedups.
Fused Bidirectional LSTM/GRU: JAX backend now fuses Bidirectional LSTM into a single cuDNN call; fused bidirectional GRU added for Torch backend.
CTC Beam Search Decoder: Added CTC beam search decoding for the Torch backend.
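
The idea behind sliding window attention can be sketched as a mask in which each query attends only to a bounded number of recent positions. The sketch below assumes a causal window that includes the current position; this convention and the helper name `sliding_window_causal_mask` are assumptions for illustration, and the exact semantics of the sliding_window parameter in Keras may differ.

```python
def sliding_window_causal_mask(seq_len, window):
    """Build a seq_len x seq_len boolean attention mask.

    mask[q][k] is True when query position q may attend to key position k.
    Assumed convention: k must not be in the future (k <= q) and must be
    one of the `window` most recent positions (q - k < window).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [(k <= q) and (q - k < window) for k in range(seq_len)]
        for q in range(seq_len)
    ]


mask = sliding_window_causal_mask(seq_len=5, window=3)
for row in mask:
    print(" ".join("1" if allowed else "0" for allowed in row))
# 1 0 0 0 0
# 1 1 0 0 0
# 1 1 1 0 0
# 0 1 1 1 0
# 0 0 1 1 1
```

Under this convention each row has at most `window` allowed positions, so the attention cost grows linearly with sequence length rather than quadratically.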

Training and Optimizers

MultiOptimizer: Supports training sub-networks with different optimizers.
SKLearn Classifier: Added predict_proba method to SKLearnClassifier.
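
As a conceptual sketch of training sub-networks with different optimizers, the following pure-Python example routes each named parameter group to its own update rule. The `SGD` and `PerGroupOptimizer` classes here are hypothetical stand-ins written for illustration; they do not reflect the constructor or methods of the Keras MultiOptimizer.

```python
class SGD:
    """Plain gradient descent on a list of scalar parameters."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def apply(self, params, grads):
        return [p - self.learning_rate * g for p, g in zip(params, grads)]


class PerGroupOptimizer:
    """Hypothetical router: one optimizer per named parameter group."""

    def __init__(self, optimizers):
        self.optimizers = dict(optimizers)

    def apply(self, params, grads):
        # Each group is updated only by the optimizer assigned to it.
        return {
            name: self.optimizers[name].apply(params[name], grads[name])
            for name in params
        }


params = {"encoder": [1.0, 2.0], "head": [0.5]}
grads = {"encoder": [0.5, 0.5], "head": [1.0]}

# The encoder uses a larger step size than the head.
optimizer = PerGroupOptimizer({"encoder": SGD(0.5), "head": SGD(0.25)})
updated = optimizer.apply(params, grads)
print(updated)  # {'encoder': [0.75, 1.75], 'head': [0.25]}
```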

Export and Deployment

Keras-to-Torch Export: Export Keras models to native PyTorch nn.Module via model.export(..., format="torch").
LiteRT (TFLite) Export for PyTorch: Added LiteRT export support for models using the PyTorch backend.
LiteRT Compatibility Fix: Fixed LiteRT export for Keras 3 + TF 2.20 + Python 3.13.
ONNX Export: Support for dict/list inputs in Torch ONNX export; documented static input signature requirement for LiteRT PyTorch export.

Distribution and Parallelism

ModelParallel Improvements: Defined contiguous replica-group data shard ID convention; added distribution information (num_processes, num_model_replicas, data_shard_id).
Initializer Distribution Layout: Initializers can now handle the distribution layout directly with JAX.
TF Dataset Distribution: Refactored TF dataset distribution with centralized sharding routing; fixed data distribution for model training in JAX.

OpenVINO Backend Support

The OpenVINO backend received continued improvements:

New Operations: Implemented glu, sparsemax, gaussian_blur, logdet, cholesky, lu_factor, erfc, segment_min, segment_prod, percentile, nanmedian, nanpercentile, unique, flash_attn, greedy ctc_decode, solve_triangular, compute_homography_matrix, and image transforms (affine, perspective, elastic).
Opset Upgrades: Upgraded to opset16, first for select operations and then in full.
Fixes: Corrected dynamic/symbolic shape handling, mask propagation, random seed determinism, dropout during predict, Lanczos interpolation in resize, and dynamic batch shape propagation; improved efficiency by using native ops.

Security

HDF5 Hardening: Reject ExternalLink/SoftLink groups, virtual datasets, and shape-bomb datasets in model loading.
Archive Hardening: Reject tar members and links that escape the extraction directory, and ZIP/NPZ members that declare excessive data. Validate asset paths from Orbax checkpoints.
Deserialization Safety: Disable pickle in np.load, fix insecure deserialization in dataset utilities, and make Lambda/TorchModuleWrapper from_config fail closed when safe_mode is unset.
CI/Workflow: Fix prompt injection in issue triage workflow.
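
The tar traversal check described under Archive Hardening can be sketched with the standard library alone. This is a minimal illustration of the general technique, not the code Keras uses; `is_within_directory` and `find_unsafe_members` are hypothetical helper names, and the demo archive is built in memory purely for the example.

```python
import io
import os
import tarfile
import tempfile


def is_within_directory(directory, target):
    """True if `target` resolves to a path inside `directory`."""
    directory = os.path.realpath(directory)
    target = os.path.realpath(target)
    return os.path.commonpath([directory, target]) == directory


def find_unsafe_members(tar, dest_dir):
    """Names of members whose path or link target escapes `dest_dir`."""
    unsafe = []
    for member in tar.getmembers():
        member_path = os.path.join(dest_dir, member.name)
        if not is_within_directory(dest_dir, member_path):
            unsafe.append(member.name)
            continue
        if member.issym():
            # Symlink targets are relative to the member's own directory.
            link_path = os.path.join(os.path.dirname(member_path), member.linkname)
        elif member.islnk():
            # Hard link targets are relative to the archive root.
            link_path = os.path.join(dest_dir, member.linkname)
        else:
            continue
        if not is_within_directory(dest_dir, link_path):
            unsafe.append(member.name)
    return unsafe


# Build a small in-memory archive: one safe file, one traversal, one bad link.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    archive.addfile(tarfile.TarInfo("ok/weights.bin"))
    archive.addfile(tarfile.TarInfo("../escape.txt"))
    link = tarfile.TarInfo("ok/link")
    link.type = tarfile.SYMTYPE
    link.linkname = "../../outside"
    archive.addfile(link)
buffer.seek(0)

dest = os.path.join(tempfile.gettempdir(), "extract_demo")  # never written to
with tarfile.open(fileobj=buffer, mode="r") as archive:
    unsafe = find_unsafe_members(archive, dest)
print(unsafe)  # ['../escape.txt', 'ok/link']
```

A loader following this pattern would refuse to extract the archive when the returned list is non-empty, rather than skipping the offending members silently.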

Bug Fixes and Improvements

Backend Specific Improvements

PyTorch: Fixed convert_to_tensor for Python scalars, divide_no_nan() NaN gradients, BiLSTM dispatch, lstsq with rcond, SymInt/SymFloat handling in convert_to_tensor and slice, and median for even-length inputs.
JAX: Fused Bidirectional LSTM into cuDNN call.
TensorFlow: Fixed depthwise/separable conv with stride and dilation. Optimized tf.tensordot by removing redundant float casts.
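
For context on the even-length median fix: NumPy's convention averages the two middle values of an even-length input, whereas `torch.median` returns the lower of the two. The pure-Python sketch below contrasts the two conventions. That the Keras fix aligns the Torch backend with the averaging convention is an assumption drawn from the note above, and both function names are illustrative.

```python
def median(values):
    """Median that averages the two middle values for even-length input."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    mid = n // 2
    if n % 2 == 1:
        return float(ordered[mid])
    return (ordered[mid - 1] + ordered[mid]) / 2


def lower_median(values):
    """Median that returns the lower middle value for even-length input."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("median of an empty sequence is undefined")
    return ordered[(len(ordered) - 1) // 2]


data = [4, 1, 3, 2]
print(median(data))        # 2.5
print(lower_median(data))  # 2
```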

Layers and Ops

Mixed Precision Fix: Fixed float16 numerical instability in GroupNormalization with small epsilon; disabled autocast for mixed precision stability.
Dense Layer OOM: Fixed GPU OOM with rank-3 input due to BatchMatMulV2 gradient materialization.
GroupQueryAttention: Fixed symbolic output shape with return_attention_scores.
Conv Transpose: Save output_padding in Conv1D/2D/3DTranspose get_config.
Attention Layer: Fixed stale return_attention_scores flag in compute_output_spec; save seed in get_config.
Ops Validation: Added comprehensive axis validation in softmax, normalize, swapaxes, moveaxis, sort, argsort, cumsum, cumprod, take, stack, concatenate, split, diff, transpose, and more.
EinsumDense: Fixed compute_output_shape to work before build.
Discretization: Fixed bin boundaries calculation.

Model Saving and Loading

Nested Sublayers: Fixed save/load for custom models/layers with sublayers in nested lists.
Orbax: Fixed a bug caused by Orbax's recent rename from "pytree" to "state".
Sequential: Improved error handling for missing keys during deserialization.
Pipeline: Validated from_config layers and avoided mutating the input config.

Other Improvements

Callbacks: Fixed EarlyStopping/ReduceLROnPlateau resetting self.best between fit calls; fixed TensorBoard callback step counter never updating.
Progress Bar: Removed double averaging of metrics.
LoRA Weights: Use float32 to avoid underflow/overflow risk.
Tree Utilities: Optimized tree.flatten and tree.map_structure for common cases.
Depthwise/Separable Conv: Removed backend-specific strides + dilation_rate restriction; validated output shapes in build; transposed channels_first to NHWC on CPU.
Regularizers: Allow plain callables as regularizers; fixed L1L2 regularizer.
Added AI Contribution Policy.
Added CITATION.cff for repository citation.

New Contributors

We would like to thank our new contributors for making their first contribution to the Keras project:

@ssam18 made their first contribution in #22617
@chir4gm made their first contribution in #22635
@MalyalaKarthik66 made their first contribution in #22179
@satishkc7 made their first contribution in #22641
@Saumay made their first contribution in #22718
@shashaka made their first contribution in #22679
@AdonaiVera made their first contribution in #22739
@gaga1313 made their first contribution in #22538
@bzantium made their first contribution in #22740
@ShaunakDas88 made their first contribution in #22728
@mgomes0 made their first contribution in #22689
@pctablet505 made their first contribution in #22797
@sharesth23 made their first contribution in #22781
@othakkar made their first contribution in #22847
@rahulrathnavel made their first contribution in #22835
@codewithyug06 made their first contribution in #22444
@JyotinderSingh made their first contribution in #22362
@divakaivan made their first contribution in #21556
@buildwithsuhana made their first contribution in #22903
@LinZiyuu made their first contribution in #22899
@jeffcarp made their first contribution in #22961
@2HParaa made their first contribution in #22974
@dvadym made their first contribution in #23003
@Lawson-Darrow made their first contribution in #23004
@ul611 made their first contribution in #22892
@rni418 made their first contribution in #23026
@peinguim made their first contribution in #23057

Full Changelog: v3.14.0...v3.15.0