pypi Keras 3.15.0
v3.15.0

6 hours ago

Highlights

  • Keras-to-Torch Export: New export_torch enables exporting Keras models to native PyTorch nn.Module format, along with LiteRT (TFLite) export support for the PyTorch backend.
  • Sliding Window Attention: Added sliding_window parameter to MultiHeadAttention and GroupedQueryAttention for efficient long-context attention.
  • Flash / Fused SDPA: Causal-only MHA/GQA now automatically dispatches to Flash Attention (cuDNN SDPA), and the manual attention path correctly applies causal masking.
  • Multi-Optimizer Training: New MultiOptimizer supports assigning different optimizers to sub-networks.
  • New Math Operations: Added unique, pinv, matrix_rank, fabs, fmax, fmin, erfc, dsplit, percentile, nanpercentile, sobel_edges, and ssim (structural similarity) to keras.ops.
  • Security Hardening: Comprehensive hardening of model reloading against HDF5 exploits, tar/zip traversal attacks, and insecure deserialization.

New Features and Operations

Multi-Backend Operations

  • New NumPy Operations: Added unique, fabs, fmax, fmin, dsplit, erfc, percentile, nanpercentile in keras.ops.numpy.
  • New Linear Algebra Operations: Added pinv (pseudo-inverse) and matrix_rank in keras.ops.linalg.
  • New Image Operations: Added sobel_edges for edge detection and ssim (structural similarity) in keras.ops.image.
  • Negative Axes in Transpose: keras.ops.transpose now supports negative axis values.
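
As a rough illustration of what negative-axis support in a transpose usually means, the sketch below converts a permutation that may contain negative entries into its non-negative form. The helpers normalize_axes and transposed_shape are hypothetical names, not part of keras.ops; the only assumption is the NumPy convention that axis -1 refers to the last dimension.

```python
def normalize_axes(axes, ndim):
    """Map possibly negative axes into [0, ndim) and validate the permutation."""
    normalized = []
    for axis in axes:
        if not -ndim <= axis < ndim:
            raise ValueError(f"axis {axis} is out of bounds for rank {ndim}")
        # Python's modulo maps -1 to ndim - 1, -2 to ndim - 2, and so on.
        normalized.append(axis % ndim)
    if sorted(normalized) != list(range(ndim)):
        raise ValueError(f"axes {tuple(axes)} is not a permutation of rank {ndim}")
    return tuple(normalized)


def transposed_shape(shape, axes):
    """Shape obtained by permuting `shape` according to `axes`."""
    perm = normalize_axes(axes, len(shape))
    return tuple(shape[i] for i in perm)


print(normalize_axes((0, -1, -2), ndim=3))        # (0, 2, 1)
print(transposed_shape((2, 3, 4), (0, -1, -2)))   # (2, 4, 3)
```

Validating that the normalized axes form a permutation also catches inputs such as (0, -3), where two entries refer to the same dimension.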

Layers and Attention

  • Sliding Window Attention: MultiHeadAttention and GroupedQueryAttention layers support the sliding_window parameter for efficient long-sequence processing.
  • Flash Attention Engagement: Causal-only attention in MHA/GQA now uses Flash SDPA for significant speedups.
  • Fused Bidirectional LSTM/GRU: JAX backend now fuses Bidirectional LSTM into a single cuDNN call; fused bidirectional GRU added for Torch backend.
  • CTC Beam Search Decoder: Added CTC beam search decoding for the Torch backend.
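
To make the idea behind sliding window attention concrete, here is a minimal pure-Python sketch of a causal sliding-window mask. It assumes one common convention (each query attends to itself and to the window - 1 positions before it); the exact semantics of the sliding_window parameter in Keras may differ, and the function name below is hypothetical.

```python
def sliding_window_causal_mask(seq_len, window):
    """Return a seq_len x seq_len boolean mask.

    mask[q][k] is True when query position q may attend to key position k:
    the key must not lie in the future (k <= q) and must fall within the
    last `window` positions (q - k < window).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [(k <= q) and (q - k < window) for k in range(seq_len)]
        for q in range(seq_len)
    ]


mask = sliding_window_causal_mask(seq_len=5, window=2)
for row in mask:
    # "x" marks an allowed (query, key) pair, "." a masked one.
    print("".join("x" if allowed else "." for allowed in row))
```

Each row contains at most `window` allowed entries, so the number of attended pairs grows as seq_len * window rather than seq_len squared, which is where the long-context savings come from.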

Training and Optimizers

  • MultiOptimizer: Supports training sub-networks with different optimizers.
  • SKLearn Classifier: Added predict_proba method to SKLearnClassifier.
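
The general idea of assigning different optimizers to sub-networks can be sketched without any framework. The code below is not the Keras MultiOptimizer API: the SGD class, the apply_by_prefix function, and the name-prefix routing are all hypothetical stand-ins chosen for illustration.

```python
class SGD:
    """Tiny stand-in for an optimizer: plain gradient descent on floats."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def apply(self, value, grad):
        return value - self.learning_rate * grad


def apply_by_prefix(params, grads, routes, default):
    """Update each parameter with the optimizer whose prefix matches its name.

    `routes` maps a name prefix (standing in for a sub-network) to an
    optimizer; parameters that match no prefix fall back to `default`.
    """
    updated = {}
    for name, value in params.items():
        optimizer = next(
            (opt for prefix, opt in routes.items() if name.startswith(prefix)),
            default,
        )
        updated[name] = optimizer.apply(value, grads[name])
    return updated


params = {"encoder/w": 1.0, "head/w": 1.0}
grads = {"encoder/w": 0.5, "head/w": 0.5}
routes = {"encoder/": SGD(learning_rate=0.01), "head/": SGD(learning_rate=0.1)}
new_params = apply_by_prefix(params, grads, routes, default=SGD(0.05))
print(new_params)
```

With identical gradients, the encoder weight moves ten times less than the head weight, which is the typical motivation for this setup (for example, a small learning rate on a pretrained backbone and a larger one on a new head).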

Export and Deployment

  • Keras-to-Torch Export: Export Keras models to native PyTorch nn.Module via model.export(..., format="torch").
  • LiteRT (TFLite) Export for PyTorch: Added LiteRT export support for models using the PyTorch backend.
  • LiteRT Compatibility Fix: Fixed LiteRT export for Keras 3 + TF 2.20 + Python 3.13.
  • ONNX Export: Support for dict/list inputs in Torch ONNX export; documented static input signature requirement for LiteRT PyTorch export.

Distribution and Parallelism

  • ModelParallel Improvements: Defined contiguous replica-group data shard ID convention; added distribution information (num_processes, num_model_replicas, data_shard_id).
  • Initializer Distribution Layout: Initializers can now handle the distribution layout directly with JAX.
  • TF Dataset Distribution: Refactored TF dataset distribution with centralized sharding routing; fixed data distribution for model training in JAX.
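
One plausible reading of a "contiguous replica-group" convention is sketched below. The names num_processes, num_model_replicas, and data_shard_id come from the notes above, but the formula is an assumption made for illustration and is not taken from the Keras source.

```python
def data_shard_id(process_index, num_processes, num_model_replicas):
    """Assign a data shard to a process under a contiguous-group layout.

    Assumption: processes [0, g), [g, 2g), ... each form one model replica,
    where g = num_processes // num_model_replicas, and every process in a
    replica reads the same data shard.
    """
    if num_processes % num_model_replicas != 0:
        raise ValueError("num_processes must be divisible by num_model_replicas")
    if not 0 <= process_index < num_processes:
        raise ValueError("process_index out of range")
    processes_per_replica = num_processes // num_model_replicas
    return process_index // processes_per_replica


# 8 processes, 2 model replicas: processes 0-3 share shard 0, 4-7 share shard 1.
print([data_shard_id(i, 8, 2) for i in range(8)])
```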

OpenVINO Backend Support

The OpenVINO backend received continued improvements:

  • New Operations: Implemented glu, sparsemax, gaussian_blur, logdet, cholesky, lu_factor, erfc, segment_min, segment_prod, percentile, nanmedian, nanpercentile, unique, flash_attn, greedy ctc_decode, solve_triangular, compute_homography_matrix, and image transforms (affine, perspective, elastic).
  • Opset Upgrades: Upgraded to opset16, first for select operations and then in full.
  • Fixes: Dynamic/symbolic shape handling, mask propagation, random seed determinism, dropout during predict, Lanczos interpolation in resize, dynamic batch shape propagation, and improved efficiency using native ops.
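
For readers unfamiliar with sparsemax, one of the newly implemented operations, the sketch below shows the standard sort-based algorithm in pure Python for a single 1-D input. It is a reference illustration only and says nothing about how the OpenVINO backend implements the op.

```python
def sparsemax(logits):
    """Euclidean projection of `logits` onto the probability simplex."""
    if not logits:
        raise ValueError("logits must be non-empty")
    z_sorted = sorted(logits, reverse=True)
    cumulative = 0.0
    support_size = 0
    support_sum = 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumulative += z
        # Position k stays in the support while 1 + k * z_(k) > sum of top k.
        if 1.0 + k * z > cumulative:
            support_size = k
            support_sum = cumulative
    # Threshold that makes the clipped outputs sum to exactly 1.
    tau = (support_sum - 1.0) / support_size
    return [max(z - tau, 0.0) for z in logits]


print(sparsemax([2.0, 0.0, -1.0]))  # [1.0, 0.0, 0.0]
print(sparsemax([0.5, 0.5, 0.0]))   # [0.5, 0.5, 0.0]
```

Unlike softmax, the output can contain exact zeros, which is why the first example puts all of the probability mass on a single entry.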

Security

  • HDF5 Hardening: Reject ExternalLink/SoftLink groups, virtual datasets, and shape-bomb datasets in model loading.
  • Archive Hardening: Reject tar members and links that escape the extraction directory, and ZIP/NPZ members that declare excessive data sizes. Validate asset paths from Orbax checkpoints.
  • Deserialization Safety: Disable pickle in np.load, fix insecure deserialization in dataset utilities, and make Lambda/TorchModuleWrapper from_config fail closed when safe_mode is unset.
  • CI/Workflow: Fix prompt injection in issue triage workflow.
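
The kind of check involved in rejecting archive members that escape the extraction directory can be sketched with the standard library alone. The function is_safe_member is a hypothetical helper, not Keras code, and it only covers path names: a complete defense also has to inspect symlink and hardlink targets, which this sketch does not do.

```python
import os


def is_safe_member(extract_dir, member_name):
    """Return True if `member_name` resolves to a path inside `extract_dir`.

    Only the path string is examined; nothing is read from or written to disk.
    """
    base = os.path.abspath(extract_dir)
    # abspath collapses ".." segments; join discards `base` for absolute names.
    target = os.path.abspath(os.path.join(base, member_name))
    return os.path.commonpath([base, target]) == base


for name in ("weights/model.npy", "../evil.txt", "a/../../evil.txt"):
    print(name, "->", is_safe_member("extract", name))
```

Comparing with os.path.commonpath rather than a plain string prefix avoids treating a sibling directory such as "extract_other" as if it were inside "extract".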

Bug Fixes and Improvements

Backend Specific Improvements

  • PyTorch: Fixed convert_to_tensor for Python scalars, divide_no_nan() NaN gradients, BiLSTM dispatch, lstsq with rcond, SymInt/SymFloat handling in convert_to_tensor and slice, and median for even-length inputs.
  • JAX: Fused Bidirectional LSTM into cuDNN call.
  • TensorFlow: Fixed depthwise/separable conv with stride and dilation. Optimized tf.tensordot by removing redundant float casts.

Layers and Ops

  • Mixed Precision Fix: Fixed float16 numerical instability in GroupNormalization with small epsilon; disabled autocast for mixed precision stability.
  • Dense Layer OOM: Fixed GPU OOM with rank-3 input due to BatchMatMulV2 gradient materialization.
  • GroupedQueryAttention: Fixed symbolic output shape with return_attention_scores.
  • Conv Transpose: Save output_padding in Conv1D/2D/3DTranspose get_config.
  • Attention Layer: Fixed stale return_attention_scores flag in compute_output_spec; save seed in get_config.
  • Ops Validation: Added comprehensive axis validation in softmax, normalize, swapaxes, moveaxis, sort, argsort, cumsum, cumprod, take, stack, concatenate, split, diff, transpose, and more.
  • EinsumDense: Fixed compute_output_shape to work before build.
  • Discretization: Fixed bin boundaries calculation.

Model Saving and Loading

  • Nested Sublayers: Fixed save/load for custom models/layers with sublayers in nested lists.
  • Orbax: Fixed bug from Orbax's recent rename from "pytree" to "state".
  • Sequential: Improved error handling for missing keys during deserialization.
  • Pipeline: Validate from_config layers and avoid mutating the input config.

Other Improvements

  • Callbacks: Fixed EarlyStopping/ReduceLROnPlateau resetting self.best between fit calls; fixed TensorBoard callback step counter never updating.
  • Progress Bar: Removed double averaging of metrics.
  • LoRA Weights: Use float32 to avoid underflow/overflow risk.
  • Tree Utilities: Optimized tree.flatten and tree.map_structure for common cases.
  • Depthwise/Separable Conv: Removed backend-specific strides + dilation_rate restriction; validated output shapes in build; transposed channels_first to NHWC on CPU.
  • Regularizers: Allow plain callables as regularizers; fixed L1L2 regularizer.
  • Added AI Contribution Policy.
  • Added CITATION.cff for repository citation.

New Contributors

We would like to thank our new contributors for making their first contribution to the Keras project:

Full Changelog: v3.14.0...v3.15.0
