github pytorch/pytorch v2.4.1
PyTorch 2.4.1 Release, bug fix release

latest releases: ciflow/inductor/140048, ciflow/inductor/139384, ciflow/inductor/139533...
2 months ago

This release is meant to fix the following issues (regressions / silent correctness):

Breaking Changes:

  • The pytorch/pytorch docker image now installs the PyTorch package through pip and has switch its conda installation from miniconda to miniforge (#134274)

Windows:

  • Fix performance regression on Windows related to MKL static linking (#130619) (#130697)
  • Fix error during loading on Windows: [WinError 126] The specified module could not be found. (#131662) (#130697)

MPS:

  • Fix tensor.clamp produces wrong values (#130226)
  • Fix Incorrect result from batch norm with sliced inputs (#133610)

ROCM:

  • Fix for launching kernel invalid config error when calling embedding with large index (#130994)
  • Added a check and a warning when attempting to use hipBLASLt on an unsupported architecture (#128753)
  • Fix image corruption with Memory Efficient Attention when running HuggingFace Diffusers Stable Diffusion 3 pipeline (#133331)

Distributed:

  • Fix FutureWarning when using torch.load internally (#130663)
  • Fix FutureWarning when using torch.cuda.amp.autocast internally (#130660)

Torch.compile:

  • Fix exception with torch compile when onnxruntime-training and deepspeed packages are installed. (#131194)
  • Fix silent incorrectness with torch.library.custom_op with mutable inputs and torch.compile (#133452)
  • Fix SIMD detection on Linux ARM (#129075)
  • Do not use C++20 features in cpu_inducotr code (#130816)

Packaging:

  • Fix for exposing statically linked libstdc++ CXX11 ABI symbols (#134494)
  • Fix error while building pytorch from source due to not missing QNNPACK module (#131864)
  • Make PyTorch buildable from source on PowerPC (#129736)
  • Fix XPU extension building (#132847)

Other:

  • Fix warning when using pickle on a nn.Module that contains tensor attributes (#130246)
  • Fix NaNs return in MultiheadAttention when need_weights=False (#130014)
  • Fix nested tensor MHA produces incorrect results (#130196)
  • Fix error when using torch.utils.flop_counter.FlopCounterMode (#134467)

Tracked Regressions:

  • The experimental remote caching feature for Inductor's autotuner (enabled via TORCHINDUCTOR_AUTOTUNE_REMOTE_CACHE) is known to still be broken in this release and actively worked on in main. Following Error is generated: redis.exceptions.DataError: Invalid input of type: 'dict'. Please use nightlies if you need this feature (reported and Fixed by PR: #134032)

Release tracker #132400 contains all relevant pull requests related to this release as well as links to related issues.

Don't miss a new pytorch release

NewReleases is sending notifications on new releases.